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1.  INTRODUCTION 


Foliage  presents  a  type  of  terrain  characteristic  that  can  obstruct  information  required  by  the 
warfighter.  Traditional  remote  sensing  offers  rather  limited,  if  any,  capabilities  to  acquire  data 
under  the  canopy.  Recently,  state-of-the-art  LiDAR  technology,  in  particular  waveform 
processing,  can  provide  not  only  foliage  penetration  but  can  also  support  better  object 
identification  and  material  classification  under  the  canopy.  To  make  use  of  this  relatively  new 
technology,  not  yet  operational  in  the  Army’s  surveillance  and  reconnaissance  practice,  this 
project  aims  at  contributing  too  much  needed  research  necessary  to  bridge  the  gap  between  the 
potential  of  an  active  sensor  technology  and  the  implementation  of  deployable  systems.  The 
proposed  approach  is  based  on  combining  the  use  of  targets  and  materials,  assessed  by  full 
waveform  LiDAR,  to  understand  and,  consequently,  develop  the  acquisition  strategy,  processing, 
and  product  generation  necessary  to  best  exploit  the  mission.  Our  objective  is  to  model  full 
waveform  LiDAR  acquisition  parameters  designed  around  various  canopy  and  foliage  types 
(e.g.,  single  to  multi-  storied  canopies,  temperate,  tropical,  sub-tropical)  that  are  of  major  interest 
from  the  Army’s  operational  standpoint.  Matching  canopy  type  to  LiDAR  acquisition  would 
more  effectively  drive  any  given  sortie  and  optimize  the  mission  acquisition  and  support  the 
Army’s  field  operations,  including  planning,  battlefield  command  and  logistics  in  general. 


2.  RESEARCH  OBJECTIVES 

With  significant  advancements  in  LiDAR  technology,  the  information  content  available  to 
applications  is  rapidly  growing.  In  particular,  waveform  processing  offers  a  new  opportunity  for 
surface  material  characterization  and  subsequently  object  classification.  The  primary  goal  of  this 
research  project  is  to  analyze  the  waveform  potential  with  respect  to  object  material 
identification  and  classification  under  various  canopy  conditions  that  are  important  to  support  the 
Army’s  field  operations,  including  planning,  battlefield  command  and  logistics.  The  research 
objectives  span  over  three  major  fields,  including  (1)  full  waveform  LiDAR  processing  in 
general,  (2)  the  use  of  LiDAR-specific  targets,  and  (3)  full  waveform  application  in  forestry. 
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The  review  on  current  waveform  processing  methods,  LiDAR-specific  targets  and  forestry 
applications  clearly  indicates  that  while  research  is  active  in  all  fields,  these  works  are  generally 
disconnected,  and  there  is  rather  limited  effort  devoted  to  combining  these  three  activities  to 
exploit  the  full  potential  of  waveform  information  extraction.  The  actual  objectives  are  the 
followings: 


1 .  Characterize  LiDAR  waveform  with  respect  to  ARO-specific  requirements,  including 
information  extraction  in  general  to  support  tactical  identification,  localization,  and 
tracking,  and  surveillance  and  reconnaissance. 

2.  Extend  investigation  (1)  into  the  use  GmAPD  sensing  (Geiger-mode  avalanche  photo¬ 
diode);  depending  on  data  availability. 

3.  Create  a  material  and  terrain  surface  model  database  to  establish  empirical  or,  if 
possible,  analytical  relationship  to  LiDAR  waveform;  this  would  be  based  on 
analyzing  field  data,  including  the  use  of  dedicated  targets  and  reference  materials. 

4.  Develop  methods  based  on  (3)  to  process  waveform  data  in  real-time  and  post¬ 
processing  mode  to  support  near  real  time  decision-making  for  the  Joint  Operating 
Environment. 

5.  Provide  performance  metrics  for  the  waveform-based  terrain  characterization, 
including  benchmarking  of  the  developed  methods  and  identifying  the  material  and 
surface  classes  that  can  be  reliably  extracted. 

6.  Investigate  the  tradeoff  between  on-board  processing,  transmission,  centralized 
processing,  and  dissemination;  develop  recommendation  for  system  developments 
and  deployment. 


The  above  tasks  include  a  balanced  amount  of  algorithmic  research,  initial  implementation, 
testing,  data  acquisition  (jointly,  by  ARO  and  other  government  agencies),  data  analysis, 
software  developments  and  technical  report  preparation.  Most  of  the  algorithmic  developments 
are  implemented  in  the  Matlab  environment,  while  some  programs  may  be  compiled  with 
Microsoft  Visual  C++  on  the  Windows  platform.  The  format  of  LiDAR  data  includes  both 

9 


Exploitation  of  Full- waveform  LiDAR  to  Characterize/Exploit  under  Canopy  Targets  -  Foliage  Penetration  (FOPEN) 


manufacturer-specific  as  well  as  the  LiDAR  data  exchange  format,  i.e.,  the  industry  standard 
LAS  format. 


3.  INCIDENCE  ANGLE  DEPENDENCE  STUDY 

This  research  task  is  aimed  at  improving  our  understanding  of  the  impact  of  the  incidence  angle 
of  the  laser  beam,  as  it  is  backscattered  from  the  target,  on  full  waveform  data.  In  addition,  a 
secondary  aspect  of  the  study  is  to  investigate  the  material  dependency  of  waveforms. 

3.1  Background 

The  relationship  between  transmitted  and  received  photons  and  the  physical  environment  is 
described  by  the  following  equation: 

Ns a  R)  =  Nl(Xl)  *  p a h,  *  ^2  *  T &L.  R)T(X,  R)  *  fi(X,  WG(R)  +  nb 

where  Ns (A,  R)  is  the  number  of  received  photons  from  distance  R,  NL(XL)  is  the  number  of 
transmitted  photons,  /?  (A,  XL,  9,  R)  is  the  volume  scatter  coefficient  at  distance  R  for  incidence 
angle  9,  A R  is  the  thickness  of  the  range  bin,  A  is  the  aperture  of  the  sensor,  T(XL,  R)T(X,  R)  is 
one-way  rate  of  the  loss  photons  while  it  is  transmitted  from  laser  source  to  target  and  from 
target  to  receiver  in  distance  R,  /i(A,  XL)  is  the  system  optical  efficiency,  G(R)  is  the  geometric 
factor  of  the  optics  and  NB  is  the  photon  counts  from  background  radiation.  From  our 
perspective,  the  p  (X,  XL,  9,  R)  part  of  the  equation,  called  the  probability  to  be  scattered,  is  of 
importance.  The  expression  is  a  function  of  9,  which  is  the  incidence  angle,  R  is  the  distance 
between  the  instrument  and  the  target,  A  is  the  incoming  wavelength,  and  XL  is  the  outgoing 
wavelength.  The  construction  of  the  equation  above  suggests  that  the  impact  of  incidence  angle 
and  the  intensity  of  the  materials  may  be  detected. 

To  determine  the  theoretical  impact  of  the  incidence  angle  on  waveform,  consider  that  the  distance 
between  the  laser  source  and  the  target  is  R  =  300  m,  the  angle  between  the  object  and  the  X  axis  is  a  = 
45°  and  the  beam  divergence  is  yu  =  0.3  mrad  see  Error!  Reference  source  not  found..  In  this  special 
case,  when  the  laser  beam  is  nearly  perpendicular  to  the  X  axis,  the  incidence  angle  is  the  same  as  a  (for 
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general  case,  see  Figure  l).Note  these  parameters  reflect  the  measurement  arrangement  of  the  Riegl 
VZ400  laser  scanner,  used  to  collect  data  for  this  investigation. 


pr 


Figure  1  Impact  of  incidence  angle 

The  first  photons  are  scattered  back  from  point  Pf,  while  the  last  ones  come  back  from  point  Pi. 
Since  the  backscattering  takes  more  time  than  the  pulse  width,  by  T,  the  resulting  waveform  is 
stretched.  The  footprint  of  the  laser  beam  in  R  distance  is: 


F  ~  fiR  =  0.3  mrad  *  300  m  =  9  cm 

If  the  shape  of  the  transmitted  waveform  was  a  Gaussian-like  function,  after  the  backscatter,  the 
width  of  the  function,  at  least,  must  have  increased.  The  quantitative  value  of  this  change  can  be 
easily  calculated.  The  geometrical  distance  between  points  Pf  and  Pi  in  Y-axis  direction  is: 

dp  =  cos(a)  *  F  =  sin(45°)  *  9 cm  =  6.36  cm 

The  waveform  is  a  function  of  the  time,  thus  to  determine  the  travel  time  of  the  pulse  on  the  T 
distance  is: 


2  *  At  = 


dv  6.36  cm 

_JL  =  2  * _ 

c  299  792  458  m/s 


0.42  ns 
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Since  the  pulse  travels  forth  and  back,  there  is  multiplication  by  2.  When  the  laser  beam  is 
perpendicular  to  the  target  surface,  the  incidence  angle  is  close  to  zero.  Error!  Reference  source 
not  found,  for  the  changes  of  width  as  the  function  of  incidence  angle  in  the  left  side  and  the 
shapes  of  waveforms  in  different  incidence  angles.  Figure  2b  clearly  shows  that  for  the  relatively 
short  range  and  laser  beam  narrow  convergence  angle  the  impact  is  small. 


(a)  (b) 

Figure  2  Time  delay  as  a  function  of  the  incidence  angle  (a),  and  changes  of  the  shape  of  the  Gaussian  waveform  due 

different  incidence  angles  (b) 


3.2  Test  Dataset 

In  cooperation  with  the  USACE  ERDC  Remote  Sensing  and  Fluorescence  Spectroscopy  Lab,  a 
data  collection  campaign  was  conducted  at  the  OSU  West  Campus  on  October  15,  2012.  The 
OSU  GPS  Van  was  outfitted  with  different  targets  and  the  Riegl  VZ-400  laser  scanner  of  the 
USACE  ERDC  mobile  measurement  system  was  acquiring  laser  data  in  different  configurations; 
vehicles  and  scanner  are  shown  in  Figure  3.  The  geographical  location  of  the  field  survey  is 
shown  in  Figure  4.  The  laser  surveys  included  three  areas  with  four  data  collections,  as  listed  in 

Table  1  and  shown  in  Figure  4;  the  target-based  data  collection  was  interrupted  by  rain,  so  the 
data  for  the  targets  was  acquired  in  two  sub-sessions.  Besides  the  usual  system  check,  the 
shakeup  test  provided  a  short  range  survey  of  the  targets  on  the  GPS  Van,  see  Figure  5.  ScanPos2 
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and  ScanPos3  datasets  contain  the  measurement  from  different  types  and  orientations  of  the 
target  discs.  Finally,  ScanPos4  includes  scans  from  two  different  types  of  building:  a  parking 
garage  with  horizontal  extent  and  an  office  building  with  more  vertical  extent.  In  these  cases  the 
walls  of  the  building  is  considered  for  analyzing  the  impact  of  the  incidence  angle. 


(b) 

Figure  3  USACE  ERDC  mobile  measurement  system  with  Riegl  VZ-400  laser  scanner  (a)  and  OSU  GPSVan  with  targets  (b) 
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Figure  4  Field  survey  locations,  OSU  West  Campus;  shakeup  area  is  marked  in  green,  survey  range  with  targets  is  marked  by 
red  line,  and  survey  of  man-made  objects  area  is  marked  by  yellow 


Table  1  Description  of  the  datasets 


Name 

Description 

Time  (EST) 

Marked 

ScanPosl 

Shakeup  test  at  CFM 

13:45:29-13:50:39 

Green 

ScanPos2 

Surveying  disc  targets  (session  1) 

14:31:54-14:54:41 

Red 

ScanPos3 

Surveying  disc  targets  (session  2) 

17:20:19-17:55:31 

Red 

ScanPos4 

Surveying  buildings 

18:17:19-18:18:43 

Yellow 
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Figure  5  Shakeup  are  point  cloud 


The  circular  targets  installed  on  the  GPS  Van  were  covered  by  six  different  materials  to  support 
the  investigation  of  waveform  dependency  on  physical  properties.  The  different  coatings  are 
shown  in  Figure  6,  and  they  represent  several  materials  of  interest. 


50  cm 

< - > 


Figure  6  Target  materials 

The  data  preprocessing  primarily  included  the  extraction  of  all  the  relevant  information  to 
waveform  from  the  raw  measurement  files.  The  direct  access  to  the  data  was  provided  through 
the  Riegl  software  toolkit,  RiWaveLib  library.  Note  USACE  ERDC  provided  all  the  raw 
measurement  files  as  well  as  the  point  clouds  in  LAS  format  later.  In  addition,  the  GPS  Van 
location  and  orientation  was  computed  to  provide  accurate  georeferencing  for  the  targets.  The 
overall  workflow  of  this  preprocessing  is  shown  in  Figure  7. 
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Figure  7  Workflow  of  the  raw  data  processing 

WFM  files  and  the  index  files  are  provided  from  the  measurements,  and  the  coordinates  of  the 
points  can  be  calculated  with  the  WfmCoordCalc  program  developed  by  OSU.  The  Riegl  VZ-400 
scanner  uses  low  and  high  channel  units  for  digitizing  the  waveforms,  and,  thus,  waveforms  from 
the  WFM  files  are  derived  from  low  or  high  or  both  channels.  In  our  data  processing,  first  the 
low  level  channel  is  used  when  available.  If  only  the  high  channel  is  available,  the  high  channel 
intensities  are  converted  to  low  channel  intensities.  For  this  reason,  the  gain  between  the  two 
channels  had  to  be  determined,  and  thus  this  function  can  be  used  to  convert  from  high  channel 
to  low  channel  intensity.  Note  that  Riegl  provides  no  information  on  the  relation  between 
channel  intensity  values.  Based  on  test  data,  linear  regression  between  the  relative  gain  and  high 
channel  intensity  was  established,  as  shown  in  Figure  8a.  In  the  subsequent  processing,  the  following 
equation  was  applied  for  scaling  the  high  channel  intensities  to  low  level  intensity  domain: 

j  _  Ifiigh 

low  ~  0.006781497  +  0.000240794  * ~U~h 
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Relative  channel  gain  vs.  high  power  intensity 


Low  power  vs.  high  power  channel  intensities 


Figure  8  Relative  channel  gain  and  high  channel  intensities  (a),  actual  and  computed  low  and  high  channel  intensities  (b) 


The  differences  between  the  measured  and  computed  intensities  can  be  seen  in  Figure  8b.  Note 
that  the  fitting  is  not  satisfactory  enough,  and,  thus,  this  determined  correlation  was  only  used  for  the 
coordinate  calculation,  and  it  was  not  directly  applied  to  the  processing  of  waveform  intensities. 
In  the  subsequent  calculations,  as  mainly  low  channel  we  used,  though  the  coordinate  calculation 
high  channel  data  was  also  used  after  transformation.  After  finishing  coordinate  calculation,  the 
CRD  file  contains  the  coordinates  of  the  waveforms. 


3.3  Processing  ScanPosl  and  ScanPos2  Datasets 

The  GPSVan,  equipped  with  six  target  discs,  see  Figure  3b,  was  positioned  at  a  distance  from  the  Riegl 
VZ-400  station  of  about  330  meter.  Unfortunately,  the  digitizer  uses  a  2  ns  sampling  rate  that  is 
below  of  the  normal  1  ns  value,  widely  used  in  airborne  LiDAR,  making  the  measurement  of  the 
impact  of  the  incidence  angle  on  waveform  difficult.  The  targets  were  scanned  in  1 1  different 
orientations  of  the  van,  representing  a  broad  range  of  the  angle  of  incidence.  Note  the  laser 
scanner  station  was  in  a  fixed  position  during  the  scans  and  only  the  van  did  move  to  change 
orientation.  The  intensity  values  of  a  sample  point  cloud  obtained  from  the  targets  at  close  to  zero 
incidence  angle  is  shown  in  Figure  9.  While  keeping  the  vehicle  center  at  the  same  location  and  just 
changing  the  orientation  of  the  van  was  attempted,  there  was  a  minor  deviation  in  the  vehicle  center  as 
shown  in  Figure  10. 
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Figure  9  Intensity  values  at  near  zero  incidence  angle 


Figure  10  Various  positions  and  orientations  of  the  van  in  the  XY  plane 
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Though  the  GPS/IMU  system  of  the  van  provided  highly  accurate  georeferencing  for  the  targets, 
as  a  test,  the  directions  of  the  discs,  i.e.  the  van  orientation,  was  also  determined  from  measured 
point  cloud.  In  this  exercise,  two  methods  were  used:  (1)  grouping,  and  (2)  eigenvector  based 
solution.  In  the  case  of  grouping  method,  the  dataset  is  divided  into  two  parts  by  the  indexes  of 
the  point,  and  then  the  angle  between  the  two  centers  of  the  groups  is  determined  in  the  XY 
plane,  defining  the  direction  of  the  van.  The  eigenvector  based  method  first  computes  the 
covariance  matrix,  and  then  solves  for  the  eigenvalues.  The  direction  of  the  greatest  eigenvalue 
approximates  the  direction  of  the  van,  which  is  identical  to  line  defined  by  the  targets.  The  overall 
results  with  remarks  are  listed  in  Table  2.  Note  that  in  most  cases,  a  180  degree  difference  can  be 
detected  between  the  grouping  and  the  eigenvalue  solutions.  This  is  caused  by  the  fact  that  the 
direction  of  the  largest  eigenvector  shows  the  major  extension  of  the  point  set,  while  the 
direction  obtained  by  the  group  method  depends  on  the  direction  of  the  scan.  With  the  center 
point  of  the  point  cloud  of  the  targets  and  the  direction  of  the  targets,  the  incidence  angle  can  be 
easily  calculated;  Figure  1 1  shows  the  principle  of  the  calculation. 


Figure  11  Calculation  of  the  incidence  angle 
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Table  2  Direction  of  the  disks,  determined  from  measurements 


ID 

Solution 

from 

group 

method 

[degree] 

Solution 

from 

eigenvector 

method 

[degree] 

Final 

direction 

(P> 

[degree] 

Angle 

of 

center 

(a) 

[degree] 

Angle  of 
incidence 
(X) 

[degree] 

Orien¬ 

tation 

Remarks 

172120 

-30.1 

149.7 

149.7 

69.0 

9.3 

\ 

Few  points 

172411 

151 

150.9 

150.9 

69.1 

8.2 

\ 

172807 

-28.8 

151 

151 

69.0 

8 

\ 

173132 

133.9 

133.5 

133.5 

69.3 

25.8 

\ 

A  disk  from  the 
center  is  missing 

173523 

-61.1 

118.6 

118.6 

69.3 

40.7 

\ 

A  disk  from  the 
center  is  missing 

173842 

99.1 

98.9 

98.9 

69.4 

60.5 

i 

A  disk  from  the 
center  is  missing 

174216 

83.7 

-96.2 

83.8 

69.4 

75.6 

| 

174655 

83.3 

76.4 

76.4 

69.4 

83.0 

/ 

Large  difference 
of  the  solutions 

175044 

-22.5 

157.7 

157.7 

69.0 

1.3 

\ 

175325 

167.4 

69.7 

69.7 

69.0 

89.3 

/ 

Very  few  points, 
large  difference 
of  the  solutions 

175531 

-22.5 

157.7 

157.7 

69.0 

1.3 

\ 

3.3.1  Point  filtering 

The  point  filtering  was  used  to  determine  the  points  backscattered  from  each  disk.  Since  the 
target  disks  are  in  a  plane,  positioned  slightly  in  front  of  the  van,  the  point  cloud  can  be  easily 
thresholded  to  an  envelope  containing  all  the  points  of  the  six  targets.  Also,  the  gaps  between  the 
disk  can  be  used  to  separate  the  initial  point  cloud  into  six  groups.  Next,  fitting  planes  are 
calculated  for  all  the  six  targets.  In  the  datasets  the  standard  deviation  of  the  distances  from  the 
plane  was  ~6-8  cm,  and  then  those  points  are  selected  which  are  within  that  distance  from  the 
fitting  plane.  After  filtering,  these  points  are  rotated  to  the  XZ  plane,  as  shown  in  Figure  12.  Filtered 
points  on  the  fitted  plane  are  shown  in  Figure  13.  The  points,  which  lie  within  0.25  m  (that  was  the  radius 
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of  the  discs)  from  the  center,  are  selected  and  they  are  subject  of  all  the  subsequent  analysis,  as  there  is  a 
high  confidence  that  these  point  are  from  the  target  disc. 


Initial  point  cloud  Filtered  point  cloud 

Figure  12  Point  cloud  of  a  disk  target  (red)  with  fitting  plane,  points  within  1  cm  distance  from  plane  (blue),  points  after 

rotation  to  XZ  plane  (green) 


Figure  13  Points  within  25  cm  from  the  center 
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3.3.2  Basic  characterization  of  waveform  groups 


The  visualization  of  the  waveforms  form  the  six  targets  is  provided  in  Figure  14.  Note  that  the  materials 
with  strong  reflective  properties  can  easily  differ  from  the  ones  with  non-reflectivity  properties.  Other 
material  properties  cannot  be  recognized,  and  thus,  suggesting  that  materials  should  be  classified  by 
reflectivity  on  this  dataset.  Consequently,  the  materials  can  be  grouped  by  reflectivity;  using  three  classes, 
the  materials  are  grouped,  as  listed  in  Table  3. 


Figure  14  Full  waveforms  grouped  by  materials  (blue),  and  average  waveforms  (red) 


Table  3  Coarse  target  reflectivity  grouping 


Material 

Class  by  reflectivity 

Retro  reflective 

1 

Wood 

3 

Reflective  metal 

2 

Fluffy  plastic 

3 

Card  board  paper 

3 

Painted  wood 

3 
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Waveforms  from  all  groups  with  respect  to  the  incidence  angle  are  visualized  in  Figure  15.  Notice  that 
target  3  is  not  used  in  this  analysis,  as  the  highly  reflective  metal  material  has  a  mirror-type  of  reflection; 
no  returns  exist  if  there  is  any  incidence  angle.  Figure  16  illustrates  incidence  angles  where  no  data 
available  from  target  3  (marked  by  red  circle). 


Figure  15  Full  waveforms  from  all  targets  grouped  by  incidence  angle  (blue),  and  average  waveforms  (red) 
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Figure  16  Directions  illustrating  when  no  waveforms  are  obtained  from  target  3  (highly  reflective  metal) 


3.3.3  Gaussian  waveform  decomposition 

Gaussian  waveform  decomposition  is  the  most  widely  used  waveform  processing  method.  The  idea  is  that 
if  the  outgoing  laser  pulse  has  a  Gaussian  shape,  then  the  backscattered  pulse  should  have  similar  shape 
too.  Or  for  sparse  vertical  structures,  such  as  vegetation,  the  returning  waveform  is  composed  of  several 
reflections,  and  thus  the  waveform  is  the  sum  of  several  pulses  of  Gaussian  shape.  Once  the  waveform  is 
decomposed,  all  the  shapes  will  represent  an  object  surface/boundary,  and,  in  addition,  the  shape 
parameters  can  be  used  for  further  investigation,  such  as  correlation  the  surface  geometrical  or  material 
properties.  Since  the  outgoing  pulse  is  rarely  of  perfect  Gaussian  shape,  and  during  the  backscattering  the 
shape  of  the  reflected  pulse  can  be  distorted,  generalized  Gaussian  shapes  are  frequently  used  in 
waveform  processing.  In  theory,  the  shape  of  the  outgoing  pulse  can  be  used  for  decomposition,  which 
leads  to  an  ill-posed  deconvolution.  In  this  investigation,  a  generalized  Gaussian  shape  model  is  used,  as 
described  by  the  following  equation: 


/(x)  =  mi(m) 


atan(a(x  —  t)) 
n 


e 


b 
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where  t  (time)  is  the  translation  of  the  function,  a  is  the  skewness  parameter,  b  is  the  flattening,  s  is  the 
width  (i.e.  the  sigma),  mi(m)  is  the  scaling  variable  (the  magnitude  of  the  function).  Figure  17  show 
example  of  shape  functions  with  various  parameterization. 


Magnitude  parameter  Width  parameter 


Flattening  parameter 


Skewness  parameter 


-3 


-2 


-3 


-2 


-1 


Figure  17  Generalized  Gaussian  shape  functions 
The  result  of  a  typical  waveform  fitting  is  shown  in 


Figure  18.  The  discrete  full  waveform  data  is  represented  by  a  polyline  marked  with  blue  color.  Note  that 
in  our  tests,  hard  surface  targets  were  used,  so  only  single  returns  are  expected.  In  fact,  the  multiple  peak 
waveforms,  very  rare  cases,  were  filtered  out  during  processing.  For  the  parameter  estimation,  the  least 
squares  method  was  used;  the  fitted  Gaussian  shape  is  shown  in  green  in 
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Figure  18.  The  quality  of  the  fit  was  also  checked  and  the  cases,  where  the  fit  was  unsuccessful  or  of  poor 
quality,  i.e.,  the  fitting  parameter,  the  norm  of  the  residuals  of  the  regression,  was  larger  than  10,  were 
removed. 


Figure  18  Fitting  Gaussian  shape  to  a  typical  waveform 


The  generalized  Gaussian  shape  parameters  were  determined  for  each  waveform,  and  the  statistics  for  the 
five  target  groups  with  different  incidence  angles  were  computed.  Figure  19  lists  the  results  for  all  the 
fitting  parameters  for  each  target  with  all  incidence  angles.  In  addition,  the  number  waveforms  as  well  as 
adjustment  quality  parameters  are  also  included.  The  small  circles  show  the  mean  of  the  parameter  values, 
and  the  vertical  lines  indicate  the  standard  deviations. 

To  identify  trends,  a  2nd  order  polynomial  regression  was  applied  to  the  datasets;  these  curves  are  also 
shown  in  Figure  19,  and  the  norm  of  the  regression  residuals  is  listed  in  the  titles  of  each  figures.  Clearly, 
there  is  a  visible  difference  between  the  highly-reflective  and  normal  reflectivity  targets;  though,  no 
obvious  trend  can  be  observed.  Removing  the  retro-reflective  target  group,  Figure  20  shows  the  statistics, 
allowing  for  better  differentiation  of  targets  with  normal  reflectivity.  Visibly,  the  fitted  polynomials  look 
more  attractive,  but  the  residuals  are  also  noticeably  large. 
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Figure  19  Gaussian  fitting  parameters  as  a  function  of  incidence  angle,  grouped  by  the  five  target  materials  (red:  retro 
reflective,  blue:  wood,  cyan:  fluffy  plastic,  yellow:  cardboard,  and  black:  painted  wood) 
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Figure  20  Gaussian  fitting  parameters  as  a  function  of  incidence  angle,  grouped  by  the  four  target  materials  (blue:  wood, 
cyan:  fluffy  plastic,  yellow:  cardboard,  and  black:  painted  wood) 


3.3.4  Incidence  angle  estimation  with  feed  forward  neural  network 

Since  there  is  no  obvious  analytical  model  for  the  incidence  angle  estimation  from  decomposed 

waveform  parameters,  a  neural  network  based  approach  was  selected,  where  by  learning,  the 

classifier  parameters  can  be  determined.  Feed  forward  (FF)  neural  network  is  a  basic  network 

type  that  can  be  a  good  choice  for  classification,  pattern  recognition  and  several  other  problems. 

In  our  investigation,  the  network  was  trained  for  estimating  the  incidence  angle.  In  neural 

network  applications,  one  of  the  first  questions  is  the  complexity  of  the  network,  such  as  the 

number  of  hidden  layers,  the  number  of  neuron,  etc.  The  selection  is  mostly  done  empirically, 
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based  on  testing  with  various  data.  Another  question  is  the  input  and/or  the  selection  of 
parameters  derived  from  the  input  data.  In  our  case,  we  decided  to  apply  the  more  generic 
approach  of  directly  using  the  waveform,  i.e.,  the  intensity  vectors,  as  it  potentially  allows  for 
better  performance  by  using  all  the  information  in  the  data.  The  output  of  the  network  is  one 
variable  that  is  the  estimated  incidence  angle. 


To  train  a  neural  network,  the  dataset  is  generally  divided  into  two  parts:  training  set  and 
validation  set.  Following  general  practice,  the  training  set  contains  every  waveform  with  odd 
numbers  from  the  dataset  while  waveforms  with  even  numbers  form  the  validation  set.  The  aim 
of  this  selection  methodology  is  to  provide  consistent  data  in  the  training  and  validation  set.  For 
training  the  neural  network,  the  Levenberg-Marquardt  algorithm  was  used.  After  the  training,  the 
network  was  checked  on  the  validation  set.  The  incidence  angles  for  elements  in  the  validation 
set  are  known,  thus  the  “must  ”  and  “is”  value  can  be  compared  as  the  residuals  of  the  system 
model.  In  addition,  the  mean  and  the  standard  deviation  of  the  residuals  are  also  calculated. 


Table  4  shows  the  results  of  the  feed  forward  network  applied  only  to  the  retro  reflective  target 
dataset.  Note  that  as  the  number  of  neurons  is  increasing  the  mean  of  the  deviation  becomes 
worse.  This  is  caused  by  over-fitting  or  over-learning.  In  the  table,  the  best  configuration  is  the 
{4,  2,1}  alignment,  and  for  that  case  various  waveform  sizes  are  analyzed;  the  cut-off  value  in 
the  table  shows  the  number  of  waveform  element  used  in  the  process,  for  instance,  the  value  of 
16  means  that  the  first  16  intensity  values  are  the  input  of  the  neural  network.  The  table  shows 
that  no  significant  differences  between  the  cut-off  values  of  16,  32,  and  50,  suggesting  that  the 
first  16  intensity  values  contain  most  of  the  information  about  the  waveforms,  as  expected. 
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Table  4  Feed  forward  neural  network  for  incidence  angle  determination  for  the  retro  reflective  target 


Hidden  layer(s) 

{6,  3, 1} 

{14,  6, 1} 

{4,  2, 1} 

{4,  2, 1} 

{4,  2, 1} 

{4,  2, 1} 

Input  neurons' 
activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

Hidden  neurons' 
activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

pure 

linear 

Output  neuron's 
activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

Cut  off  [-] 

16 

16 

16 

32 

50 

16 

Norm  of  residuals  [°] 

298.5 

498.0 

240.5 

248.3 

252.0 

301.0 

Mean  [°] 

0.3 

2.2 

0.4 

0.3 

0.3 

0.5 

Standard  deviation  [°] 

9.1 

15.1 

7.4 

7.6 

7.7 

9.2 

The  results  of  neural  network  on  the  whole  dataset  (including  all  targets)  is  shown  in 

Table  1.  Note  that  changing  the  cut-off  edge  value  has  practically  no  impact  on  the  results.  Not 
surprisingly  compared  to  the  retro  target  case,  a  different  network  structure  provides  the  best 
solution,  resulting  in  a  mean  value  of  0.1  degree  and  standard  deviation  of  1 1.7  degree.  To  assess 
this  performance  in  absolute  sense,  the  coarse  sampling  rate  (2  ns),  the  relatively  short  object 
distance,  and  the  small  footprint  should  be  taken  into  account,  and  under  these  conditions,  the  10 
degree  variance  is  a  good  and  realistic  value. 


Table  5  Feed  forward  neural  network  for  incidence  angle  determination  for  all  the  targets 


Hidden  layer(s) 

{4,2,1} 

{4,2,1} 

{4,2,1} 

{14,  6, 1} 

{6,3,1} 

Input  neurons' 
activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

Hidden  neurons7 

activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

pure 

linear 

tangent 

sigmoid 

tangent 

sigmoid 

Output  neuron's 
activation  function 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

tangent 

sigmoid 

Cut  off 

50 

16 

16 

16 

16 

Norm  of  residuals  [°] 

856.9 

955.3 

918.1 

867.2 

572.9 

Mean  [°] 

-0.0 

0.1 

0.5 

0.2 

0.1 

Standard  deviation  [°] 

12.2 

13.6 

13.1 

12.4 

11.7 
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3.3.5  Surface  material  classification  based  on  waveform  data 

In  this  task,  a  basic  investigation  was  carried  out  to  identify  potential  correlation  between 
waveform  shapes  and  object  materials.  The  waveform  vector  is  defined  as 


where  w  is  the  waveform,  I™,  is  the  intensity  at  the  time  moment  tj  in  waveform  w.  Given  that 
the  number  of  intensity  values  of  the  two  waveforms  wlr  w2  is  the  same  (i.e.,  same  length)  and 
the  sampling  rate  is  identical  (t(Wl  =  t™2),  the  distance  between  two  waveforms  can  be  defined 
as: 


n 

d(wvw2)  =  -4?l- 

£=l 

Similarly  to  incidence  angle  estimation,  the  dataset  is  divided  to  learning  and  validation  system. 
From  the  learning  dataset  the  average  waveform  is  calculated  for  each  target  as: 

X<m  jwj 

r  _  Li=1  ti 
n,  —  > 

1  m 

where  m  is  the  number  of  the  full  waveforms.  The  average  waveforms  are  determined  for  each 
reflectivity  class.  Then  the  classification  is  based  on  calculating  the  distance  between  an 
incoming  waveform  and  the  average  waveforms,  and  then  the  nearest  one  will  determine  the 
reflectivity  class  of  the  income  waveform: 

min  d(wincome,  w^.), 

C 

where  c  is  the  class  of  reflectivity,  wincome  is  the  incoming  waveform,  and  w~c  is  the  average 
waveform  of  the  reflectivity  class  c.  To  assess  the  performance,  the  algorithm  was  tested  on  the 
validation  set  and  the  results  are  listed  in  Table  6. 
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Table  6  Classification  results  for  three  classes 


Reflectivity  class 

Success  rate 

1 

87.8  % 

2 

0.5  % 

3 

99.2% 

Overall 

84.5% 

Clearly,  Class  2  shows  an  unacceptable  performance,  which  may  be  somewhat  related  to  the 
small  number  of  waveforms  from  target  2,  see  Figure  21.  Note  that  the  shape  of  the  class  2  is 
halfway  between  classes  1  and  2. 


Class  3 


Figure  21  Waveforms  shapes  of  in  the  three  classes 

Merging  classes  2  and  3,  the  results  are  shown  in  Table  7;  the  success  rate  of  the  class  1  slightly 
decreased,  but  the  overall  rate  increased. 
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Table  7  Classification  results  for  two  classes 


Reflectivity  class 

Success  rate 

1 

79.9  % 

2 

99.6  % 

Overall 

93.1% 

3.4  Processing  ScanPos4  Dataset 

The  aim  of  this  investigation  was  to  use  vertical  and  horizontal  walls  to  estimate  the  incidence 
angle.  For  this  reason,  an  area  with  tall  buildings  was  chosen;  see  Figure  22(a)  for  the  scanned 
area.  For  the  vertical  analysis,  a  parking  garage  was  selected  (Building  I,  Figure  22  (b),  and 
Figure  22  (d)),  and  an  office  building  was  the  object  with  some  measurable  height  dimension 
(Building  II,  Figure  22  (c),  Figure  22  (e)).  In  the  case  of  the  office  building,  trees  hide  the  lower 
part  of  the  building,  and  were  subsequently  removed  (Figure  22  (e)).  The  points  of  the  front  wall 
was  used  for  data  analysis  (Figure  22  (g),  Figure  22  (h)). 


(a)  Area  of  scan 
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(c)  Building  II  (office  building) 


(d)  Point  cloud  from  Building  I 


m 


m 


m 


(e)  Point  cloud  from  Building  II 


(f)  Point  cloud  from  Building  II  after  cleaning 
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(g)  Selected  point  cloud  from  the  front  of  Building  I 


(h)  Selected  point  cloud  from  the  front  of 
Building  II 


Figure  22  Buildings  used  in  investigation 


In  the  case  of  Building  I.,  the  angle  of  the  surface  (i.e.  wall)  is  constant,  thus  the  incidence  angle 
only  depends  on  the  angle  of  laser  beam.  The  angle  of  the  wall  that  it  close  with  X-axis  of  the 
coordinate  system  of  the  instrument  was  148. 60  (reverse  clockwise).  The  intensities  of  points 
are  also  same.  The  limits  of  scanned  angle  were  between  75  and  88  degree  on  vertical  plane  and 
between  56  and  81  degree  on  horizontal  plane.  The  waveforms  can  be  seen  in  Figure  21. 


Figure  23  Waveforms  as  functions  of  horizontal  (left)  and  vertical  (right)  incidence  angle  (Building  I) 
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3.4.1  Gaussian  parameters 

Having  performed  Gaussian  shape  decomposition,  the  parameters  were  computed  as  a  function 
of  the  incidence  angle.  Note  that  no  special  trend  properties  can  be  seen. 
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Figure  24  Gaussian  parameters  of  Building  I 


The  results  of  Building  II  can  be  seen  in  Figure  23.  The  horizontal  angles  are  not  incidence  angle 
but  the  angle  of  the  laser  beam  in  the  figure.  Note  that  correlation  can  be  detected  between  the 
magnitude  and  incidence  angle,  but  the  number  of  outliers  is  significant. 
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Figure  25  Gaussian  parameters  of  Building  II 


3.4.2  Solution  with  feed-forward  neural  network 

On  the  ScanPos4  dataset,  we  also  used  the  feed  forward  neural  network  for  incidence  angle 
estimation  as  in  3.2.5.  In  this  case,  16  intensity  values  were  applied  as  input  parameters  of  the 
neural  network,  and  the  outputs  were  the  horizontal  and  vertical  components  of  the  incidence 
angles.  The  network  layer  alignment  was  {6,  4,  2},  and  tangent  sigmoid  type  transfer  functions 
were  used.  Every  odd  point  was  the  training  set,  and  the  validation  set  consisted  of  even  points. 

Applying  neural  network  to  the  Building  I  dataset,  the  mean  of  the  differences  are  nearly  zero  in 
the  case  of  both  incidence  angle  directions.  The  standard  deviation  of  the  vertical  angle  was  3.7°, 
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while  in  the  horizontal  plane  it  was  10.5°.  But  this  result  on  the  examined  dataset  is  clearly  not 
acceptable,  see  Figure  26.  Note  that  though  the  standard  deviation  seems  to  be  low,  yet  the 
measurement  ranges  of  the  angle  is  also  small,  especially  in  the  case  of  vertical  angles. 
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Figure  26  Calculated  and  correct  solution  for  Building  I 


In  the  case  of  Building  II,  the  mean  of  the  residuals  is  also  nearly  zero,  and  the  standard 
deviation  of  the  horizontal  angle  is  8.9°,  while  the  vertical  angle  deviation  is  7.8°.  These  results 
are  also  not  acceptable.  The  comparison  between  the  calculated  and  correct  values  can  be  seen  in 
Figure  27. 
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Figure  27  Calculated  and  correct  solution  for  Building  II 


4.  WAVEFORM-BASED  CLASSIFICATION  FROM  AIRBORNE  LIDAR 

4.1  Background 

Chapter  3  discussed  the  concept  of  waveform  decomposition  and  the  theory  of  applying  the 
decomposed  waveform  parameters  to  incidence  angle  estimation.  Using  terrestrial  scanner  data, 
described  in  Section  3.2,  the  methodology  was  tested  and  the  results  were  evaluated  in  Chapter  3. 
In  the  next  phase,  airborne  LiDAR  data,  acquired  in  a  normal  mission,  was  processed  to  obtain 
performance  evaluation  for  typical  operational  environment.  Extracting  or  identifying  incidence 
angle  features  from  waveforms  allows  to  obtain  object  information  based  on  a  single  waveform; 
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without  knowing  their  spatial  position  or  using  neighborhood  points’  information.  Obviously, 
determining  the  orientation  surface,  in  general,  is  not  a  challenged  topic  when  information  from 
the  point  environment  can  be  used,  but  it  becomes  rather  difficult  if  these  neighbor  points  are  not 
available,  and  the  estimation  has  to  be  based  on  only  the  shape  of  the  waveform. 


4.2  Data  set 

NOAA  acquired  a  unique  airborne  data  set  in  the  summer  of  2013  to  support  comparative 
analysis  of  airborne  LiDAR  data.  A  unique  aspect  of  the  data  acquisition  campaign  was  that 
three  sensor  units  were  installed  in  an  aircraft  to  simultaneously  acquire  data.  The  three  Riegl 
systems  included  the  Q680i  and  Q780  models  that  have  full  waveform  capabilities  and, 
subsequently  were  used  in  this  study.  The  waveform  data  was  sampled  at  1  ns,  and  two 
digitizers,  low  and  high  channels,  were  used  in  both  instruments,  see  Table  8  The  main  parameters 
of  the  NOAA  data  acquisition  data  setsThe  airborne  campaign  covered  two  sites:  Corbin,  VA  and 
Duck,  NC. 


Table  8  The  main  parameters  of  the  NOAA  data  acquisition  data  sets 
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1  II 

OHIO 

SIATE 

UNIVERSITY 


Max.  no  of  returns 

7 

7 

6 

6 

4.3  Data  processing 


This  study  covers  two  types  of  investigation  on  waveforms.  The  first  one  is  classification  by 
incidence  angle,  while  the  second  topic  is  land  type  classification.  Different  methods  and 
approaches  are  used  to  solve  the  classification  problem. 


4.3.1  Data  preprocessing 

The  sensor  level  data  preprocessing  steps  included  the  matching  of  records  from  the  two  data 
formats  (SDF  and  LAS),  conversion  the  binary  data  into  Matlab  readable  format,  etc.  For  the 
classification,  the  different  categories  (classes)  had  to  be  defined,  requiring  the  spatial 
delimitation  of  the  dataset,  as  part  of  the  preprocessing. 

Matching  LAS  and  SDF  records  by  timestamps 

The  LAS  file  specification  supports  the  waveform  storage,  but  in  our  case,  the  waveforms  are 
stored  separately  in  SDF  file.  Since  the  SDF  does  not  contain  the  coordinates  of  the  waveforms 
in  the  object  space,  and,  unfortunately,  it  cannot  be  calculated  because  the  global  coordinates  of 
the  sensor  reference  point  is  not  stored  in  the  files,  the  records  between  LAS  and  SDF  are 
matched  based  on  the  global  reference  timing  (UTC).  Note  that  the  matches  showed  about  0-2  ps 
differences  between  the  timestamps,  equivalent  of  about  600  m  distance  at  the  speed  of  light,  in 
this  time  interval,  which  means  that  the  2  ps  difference  is  just  caused  by  the  different  processing 
techniques,  and  matches  with  this  time  discrepancy  are  fine. 

Footprint  size 

The  average  flight  height  (h)  was  -680  m,  and  the  beam  divergence  (5)  of  the  Riegl’s  Q780 
system  is  0.25  mrad,  thus  the  footprint  can  be  calculated  with  the  following  expression: 

/=  h  *  S  =  0.17  m 
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Calculating  the  scan  angles 


The  Riegl  SDF  file  format  contains  several  parameters  to  georeference  the  waveforms  in  a 
spatial  coordinate  system.  The  locations  of  the  waveform  sample  (si)  can  be  determined  by  the 
following  expression: 


St  =  o  +  d  —  (ti 


ef) 


where  o  is  the  origin  vector,  d  is  the  direction  of  the  emitted  pulse,  tref  is  the  reference  time,  vg  is 
the  group  velocity,  see  Figure  28. 


Figure  28  Locations  in  the  sensor  coordinate  system  from  the  Riegl  Waveform  Extraction  Library  Manual  (Page  6) 


Where  the  direction  and  the  residual  vector  can  be  calculated  by: 

v3 

Sv  =  arctan(-p===z) 

W  +  v22 

where  v  is  any  vector  and  Sv  is  the  direction  of  v  in  the  X-Y  plane.  First,  just  the  d  (direction  of 
the  emitted  pulse)  is  used  to  calculate  this  direction.  The  results  for  a  rooftop  are  shown  in  Figure 
29. 
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Figure  29  Laser  beam  directions 


The  LAS  files  contain  the  scan  angle,  but  the  precision  of  these  data  is  at  degree-level.  The 
comparison  of  the  calculated  scan  angle  from  the  direction  vector  ( v  =  o )  with  the  scan  angle 
provided  by  the  LAS  file  can  be  seen  in  Figure  30.  Note  that  the  11°  and  12°  scan  angles  are 
from  the  LAS  file,  and  the  red  graph  in  the  figures  shows  the  calculated  scan  angles.  The  dashed 
lines  depict  the  assumed  rounding  limits. 
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Sample  [#] 

LAS  scan  angle:  12° 


Figure  30  Scan  angles  from  LAS  file  and  calculated  values  from  the  SDF  file  using  the  direction  vector 


The  direction  is  also  determined  from  the  sensor  reference  point  by  calculating  the  residual 
vector  of  the  origin  and  direction  vectors  of  the  emitted  pulse  (v  =  o  +  d ).  In  this  case  the 
comparison  between  LAS  and  the  SDF  files  shows  larger  discrepancies,  see  Figure  3 1 . 


Sample  [#] 

LAS  scan  angle:  12° 


Figure  31  Scan  angles  provided  by  LAS  file  and  the  calculated  angles  from  the  SDF  file  using  the  origin  and  direction  vectors 
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4.3.2  Feature  extraction 


The  features  extraction  provides  the  input  data  for  the  all  the  subsequent  processing.  The  feature 
vector  includes  both  derived  and  original  samples  from  the  waveforms;  Table  9  lists  the  features 
used  in  this  investigation. 


Table  9  Features 


# 

Feature  (parameter) 

Description 

FP1 

Parameters  of  Generalized 

Gaussian 

Parameter  vector  of  the  fitting  by  the  generalized 
Gaussian  function 

FP2 

Kurtosis  and  skewness 

The  statistical  estimation  of  kurtosis  and  skewness 
from  the  samples 

FP3 

Vector  of  waveform  samples 

The  waveforms  are  represented  as  the  vector  of  the 
intensities  (samples) 

FP4 

Translated  vector  of  waveform 
samples 

Same  as  P4,  but  the  maximum  place  is  translated 
to  the  middle  of  the  vector 

FP5 

Median  waveform 

Median  waveforms  calculated  from  the  classified 
waveforms,  generally  the  groups  are  determined 
by  the  user  (training-validating  process) 

Generalized  Gaussian  function  parameters 

There  are  four  parameters  to  model  the  generalized  Gaussian  function  as  defined  below: 

r(*-p?.)iPs 

G(x,p)  =  p1e  l  P3  1  +  p4 

where  pi  is  the  amplitude,  p2  is  the  position  of  the  peak  (mean),  p3  is  the  dispersion  of  the 
function  (standard  deviation),  ps  is  the  shape  parameter  (it  is  Gaussian  distribution,  if  ps  =  2),  and 
P4  is  the  translation  in  Y  direction.  Figure  1  shows  waveform  shapes  after  changing  one  of  the 
parameter. 
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p4 


60 


Figure  32  The  effect  of  the  parameters  on  the  shape  of  the  function 


The  parameter  estimation  is  based  on  the  minimization  of  the  L2  norm  of  the  residuals. 
Assuming  that  the  (x,y)  pairs  are  the  digitized  discrete  samples  of  the  waveform,  the  Gaussian 
function  can  be  determined  by  the  following  expression: 

min||y  -  G(x,p)\\ 

v 

The  optimization  problem  was  solved  by  numerical  methods  using  Matlab. 

Kurtosis  and  skewness 

The  general  discussion,  including  illustrations,  is  provided  in  3.3.3.  These  parameters  are 
estimated  from  the  standardized  third  and  fourth  central  moments  of  the  samples  by  the 
following  expressions: 

_  E[X  -  ji]4  __  A"=|(l‘  “  -4)4 

tr“  (izr-ita  -  «2)2 

and 
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5  = 


E[X-/j]3 


where  X  is  the  random  variable  (vector)  of  the  samples,  p  is  the  expected  value,  a  is  the  standard 
deviation,  xt  are  the  samples,  x  is  the  sample  mean  and  n  is  the  sample  size. 

Vector  and  translated  vector  of  waveform  samples 

The  representation  of  the  waveform  in  the  SDF  file  is  realized  as  the  C-type  vector  of  the 
intensity  values,  called  sample  vector.  The  time  differences  between  the  vector  elements  are 
constant  1  ns.  Obviously,  this  vector  can  be  the  input  of  the  classifier.  The  advantage  of  using 
these  vectors  is  all  the  information  of  the  waveform  is  preserved.  The  disadvantage,  obviously,  is 
the  high  dimensionality  of  the  input  vector.  Furthermore,  there  is  data  redundancy  that  is  not 
exploited  for  input  data  dimension  reduction.  Thus,  in  an  extreme  case,  the  classifier  may  not  be 
able  to  detect  the  differences  due  to  handling  the  large  number  of  input  variables.  Note  that  no 
time  information,  such  as  start  time  of  the  wave,  etc.,  is  contained  by  these  vectors. 

The  length  of  the  sample  vectors  in  the  two  data  sets  can  be  60  or  120,  and  the  values  are  from 
the  low  or  the  high  channel  digitizers.  In  this  study,  data  from  the  low  channel  was  used,  as  more 
waveforms  are  available  from  that  sensor.  The  length  of  the  sample  vectors  from  the  low  channel 
is  60;  those  few,  of  which  the  vector  size  is  120,  was  removed  from  the  examined  dataset. 

Analyzing  the  sample  vectors,  it  was  noted  that  the  sample  maximum  peak  locations  of  the 
waveforms  are  fluctuating  with  1-2  indices  around  the  median  peak  location.  It  is  probably 
caused  by  the  digitizing  process  of  the  LiDRA  sensor;  no  explanation  was  provided  by  the 
vendor.  Further  investigations  proved  that  the  location  of  the  maximum  peak  has  no  information 
about  the  object  from  that  was  backscattered,  and,  therefore,  these  differences  may  trouble  the 
classification  process.  For  this  reason,  the  elimination  of  these  differences  is  preferred  to  obtain 
better  performance.  The  elimination  process  is  shown  in  Figure  33  and  Table  10,  the  whole 
samples  (1)  move  to  the  standard  location  of  the  peak  (2),  and  the  empty  sample  places  are  filled 
with  the  first  sample  value  (3).  Note  again  that  the  translated  vectors  have  no  timing 
information. 
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Figure  33  Shift  correction  of  waveforms 


Table  10  Shift  corrections 


Sample  # 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

(1) 

Original  vector 

1 

4 

7 

8 

6 

3 

2 

1 

1 

1 

(2) 

Moving 

1 

4 

7 

8 

6 

3 

2 

1 

1 

1 

(3) 

Translated  vector 

1 

4 

7 

8 

6 

3 

2 

1 

1 

added,  removed 


Median/average  waveforms 

The  median  or  the  average  waveforms  can  be  calculated  from  the  sample  vectors  or  translated 
sample  vectors.  An  example  of  the  computation  is  shown  in  Figure  34  and  Table  11. 

The  median  and  average  waveforms  are  determined  as  the  part  of  the  training  process,  when  the 
classes  of  the  waveforms  are  determined  by  the  investigator.  The  residual  vectors,  such  as 
median  and  average  vectors,  are  calculated  for  each  class  from  the  class  members,  and  thus, 
theoretically,  these  estimated  common  vectors  are  the  typical  sample  vectors. 
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Figure  34  Median  and  average  waveforms 

Table  11  Median  and  average  waveforms 


Vector  # 

Samples 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

1 

1 

4 

7 

6 

8 

7 

2 

1 

1 

1 

2 

1 

4 

6 

7 

9 

8 

5 

1 

1 

1 

3 

2 

3 

4 

7 

8 

6 

3 

2 

1 

1 

4 

1 

2 

3 

4 

6 

5 

3 

1 

2 

1 

5 

1 

5 

2 

3 

9 

7 

4 

3 

2 

1 

Median 

1 

4 

4 

6 

8 

7 

3 

1 

1 

1 

Average 

1 

4 

4 

5 

8 

7 

3 

2 

1 

1 

4.4  Data  classification 

The  classifiers  tested  in  this  investigation  are  listed  in  Table  12,  including  the  input  parameters 
used.  Of  course,  the  question  of  selecting  the  optimal  classifier  for  a  certain  data  characteristics, 
in  general,  is  difficult.  Similarly,  the  input  parameter  selection  has  challenges  too,  though,  it  is 
less  problematic.  This  is  the  reason  that  several  features,  extracted  from  waveforms  are  tested 
with  the  most  commonly  used  classifiers  (that  seemed  to  be  adequate  for  the  waveform  data). 
Finally,  the  selection  of  classes  is  also  part  of  the  classification  process.  Note  that  in  most  cases, 
supervised  classification  is  used;  the  classes  are  defined  by  experts. 
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Table  12  Classifiers 


Feature 

Method 

Description 

FP1,  FP2 

Linear  discriminant  analysis 

Linear  classifier,  based  on  statistical 
considerations,  normal  distribution  of  the  input 
assumed 

FP5 

Measuring  distance  from  median 
waveform 

The  closest  matching  median  waveforms 
indicate  the  class 

FP1,  FP2, 
FP3,  FP4 

Feed  forward  neural  networks 

Commonly  used  neural  network,  which  can 
handle  non-linear  classification  problems,  and 
also  effective  for  other  problems  (regression, 
etc.) 

FP4 

Self-organizing  map  neural 
network  (Kohonan  network) 

This  type  of  neural  network  is  used  for 
unsupervised  classifying,  and  can  detect  the 
similarities  between  the  input 

4.4.1  Classifiers 
Linear  discriminant  analysis 

Linear  discriminant  analysis  (LDA)  is  a  widely  used,  general  statistical  tool  for  classification. 
The  method  assumes  that  the  linear  combination  of  the  features  can  produce  the  class 
information: 


Dj  —  d0j  +  d1jx1  +  d2jx2  +  — I-  dkjxki 

where  Dj  is  the  /th  discriminant  function,  the  d0j  ...  dkj  are  the  “weights”  or  coefficients  of  the 
function  and  x1...xk  are  the  elements  of  the  feature  vector.  The  /th  discriminant  function 
measures  whether  the  sample  is  included  by  the  /th  class  or  not.  The  discriminant  analysis 
assumes  that  the  independent  variables  (the  features)  are  of  normal  distribution,  and  they  follow 
different  normal  distributions  in  the  classes.  In  other  words,  different  classes  generate  samples 
corrupted  by  different  type  of  normal  error.  Note  that  the  method  has  other  assumptions. 

During  the  training  phase,  the  algorithm  estimates  the  coefficients  of  the  discriminant  function. 
In  the  validation  phase,  when  the  classifier  is  used,  the  discriminant  functions  have  to  be 
evaluated,  which  probvides  the  class  prediction.  In  this  study,  this  classifier  is  used  for  land  type 
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classification,  based  on  the  Gaussian  parameters  and  the  kurtosis-skewness  pairs  as  feature 
vector  for  inputs 


Measuring  distance  from  median  waveform 

This  method  is  very  simple  and  efficient  in  computational  sense.  As  it  was  presented  above,  each 
median  or  average  waveform  represents  one  class.  The  idea  is  to  calculate  the  distances  between 
the  examined  waveforms  and  the  median  or  the  average  waveform.  The  shortest  distance 
indicates  the  class.  An  example1  is  shown  in  Table  13.  The  sample  vector,  to  be  classified,  is  in 
the  first  row  and  the  median  sample  vectors  are  in  the  2nd  and  3rd  line.  First,  the  distance  of  the 
samples  has  to  be  calculated,  this  distance  can  be  defined  by  different  measures,  and  here  L2 
norm  (the  squared  root  is  the  Euclidean  distance)  is  used.  Finally,  these  distances  have  to  be 
summarized  and  the  smallest  sum  indicates  the  class;  Class  1  in  this  example. 


Table  13  Classification  using  L2  norm 


Samples 

Sum 

Sample  vector 

1 

2 

5 

6 

8 

6 

2 

1 

1 

1 

Class  1  median  waveform 

1 

1 

3 

6 

9 

7 

4 

3 

2 

1 

Class  2  median  waveform 

1 

4 

6 

7 

9 

8 

5 

1 

1 

1 

Distance  from  Class  1 

0 

1 

4 

0 

1 

1 

4 

4 

1 

0 

16 

Distance  from  Class  2 

0 

4 

1 

1 

1 

4 

9 

0 

0 

0 

20 

In  another  approach,  the  maximum  norm  is  used  instead  of  the  L2  norm.  The  usage  of  this  norm 
is  suggested  by  its  robustness  and  its  connection  with  other  statistical  assumptions  (Kolmogorov- 
Smimov  distance).  In  this  case,  the  class  is  selected  where  the  maximum  distance  between  the 
sample  and  median  waveforms  is  minimal.  Table  14  shows  an  example. 


Table  14  Classification  using  max  norm 


Samples 

Max 

Sample  vector 

1 

2 

5 

6 

8 

6 

2 

1 

1 

1 

Class  1  median  waveform 

1 

1 

3 

6 

9 

7 

4 

3 

2 

1 

Class  1  median  waveform 

1 

4 

6 

7 

9 

8 

5 

1 

1 

1 

Distance  from  Class  1 

0 

1 

2 

0 

1 

1 

2 

2 

1 

0 

2 

Distance  from  Class  2 

0 

2 

1 

1 

1 

2 

3 

0 

0 

0 

3 

1  http://davis.wpi.edu/~matt/courses/soms/ 
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Feed  forward  neural  networks 


The  feed  forward  neural  network  is  a  general  type  of  neural  network  which  can  be  used  for 
regression,  classification,  pattern  recognition,  etc.  In  this  study,  the  feed  forward  networks  are 
used  as  a  pattern  recognition  tool  because  the  expectations  are  that  the  waveform  similarities  can 
be  recognized.  The  input  of  the  network  is  the  sample  vectors.  To  reduce  the  dimensionality, 
only  half  of  the  waveform  record  is  used;  earlier  tests  indicated  that  the  regions  away  from  the 
pulse  have  less  contribution.  The  input  vector  to  the  network  is  formed  from  indices  10th  to  40th. 
The  implementation  was  based  on  the  Matlab  built-in  feed  forward  network  {patternnet ),  which 
was  especially  developed  for  pattern  recognition.  Several  configurations  of  hidden  layers  and 
neurons  were  tested,  and  the  results  showed  that  the  best  performance  was  achieved  with  2 
hidden  layers  and  neurons  of  5  and  10,  respectively,  shown  in  Figure  35.  The  activation 
functions  are  of  sigmoid-type  in  the  hidden  layers,  and  the  performance  was  measured  by  cross¬ 
entropy. 


Hidden  1 


Hidden  2 


Output 


ft©  ^ 


Figure  35  Final  configuration  of  the  pattern  recognition  neural  network 


The  number  of  output  depends  on  the  number  of  the  classes.  If  there  are  two  classes,  the  output  is 
1  or  0,  which  indicates  that  the  sample  of  the  input  belongs  to  Class  1  or  not.  In  this  study,  this 
type  of  network  is  used  with  2  classes  roof  extraction;  discussed  in  subsequent  section.  In  the 
training  process,  N  number  of  sample  vectors  of  3 1  samples  were  used  with  their  predetermined 
classes  on  the  output.  After  forming  the  networks,  the  validation  was  executed  on  M  number  of 
sample  vectors. 

In  order  to  prove  the  ability  of  the  feed  forward  neural  networks  for  classifying  the  waveforms  as 
a  pattern  recognition  problem,  a  preliminary  simulation  study  was  performed.  Two  reference 


52 


Exploitation  of  Full- waveform  LiDAR  to  Characterize/Exploit  under  Canopy  Targets  -  Foliage  Penetration  (FOPEN) 


waveforms  were  defined;  see  the  upper  part  of  Figure  36.  The  Ref2  waveform  is  the  modification 


of  Ref  1  with  increasing  the  p2  and  p3parameters  with  +0.2. 


Figure  36  References  (upper)  and  simulated  waveforms  (lower) 


The  simulated  waveforms  are  created  from  these  reference  waveforms  by  adding  a  Gaussian 
noise  with  1.5  standard  deviation  and  shifting  the  sample  indices  randomly.  The  results  as 
confusion  matrix  can  be  seen  in  Figure  37.  Note  that  the  total  performance  is  88.9%  in  the 
validation  set,  which  implies  that  this  type  of  network  can  distinguish  the  waveforms,  that  comes 
from  different  base  signal,  such  as  the  waveform  that  is  typical  in  the  class.  In  the  simulation, 
600  random  samples  were  used. 
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Figure  37  The  pattern  recognition  feed  forward  neural  network  results  on  the  simulated  data 


Self-organizing  map 

The  self-organizing  maps  (SOM),  aka  Kohonen  networks,  are  different  neural  networks,  as  these 
types  of  networks  are  very  effective  for  clustering  the  data  without  prior  knowledge.  This 
process  is  the  so-called  unsupervised  learning.  SOM  applies  neighbor  functions  to  keep  the 
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topology  of  the  input  properties.  The  method  can  detect  the  similarities  between  the  input 
patterns2 3.  The  SOM  network  determines  groups  (clusters),  in  which  the  features  are  close  enough 
to  each  other.  These  are  also  called  as  classes,  but  the  word  “classes”  is  restricted  here  for  the 
user  defined  classes,  and  the  word  “group”  for  SOM  created  classes  is  used 


One  of  the  disadvantages  of  the  SOM  is  that  the  groups  are  not  known,  as  mentioned  above. 
Therefore,  a  statistical  comparison  between  the  classes  found  by  the  SOM  and  the  actual  classes 
provided  by  the  investigator  is  used.  This  evaluation  results  in  the  decision  on  which  SOM 
groups  represent  which  “real”  class  or  classes.  Furthermore,  it  may  happen  that  more  than  one 
SOM  group  belong  to  one  or  more  “real”  classes,  and  more  than  one  “real”  class  cover  one  or 
more  SOM  groups.  Finally,  a  SOM  group  may  have  no  pair  within  the  “real”  classes,  indicating 
similarities  in  data,  which  has  not  been  considered  earlier. 


The  general  configuration  of  the  SOM  network  is  shown  in  Figure  38.  In  this  study,  the  translated 
sample  vectors  were  used  as  input.  The  translation  is  required  because  of  the  “keep  the  topology” 
property  of  the  SOM.  Therefore,  all  the  60  samples  of  the  vectors  are  used  as  features;  no 
elements  were  removed  to  prevent  any  loss  of  waveform  information.  The  number  of  the  output 
of  a  SOM  network  depends  on  the  layer  structure.  The  network  layer  starting  configuration  is  a 
NxM  dimensional  neuron  grid;  thus,  for  example,  2x2  grid  has  4  outputs,  while  a  3x3  grid  has  9 
outputs.  The  dimension  determines  the  number  of  groups  that  will  be  determined  by  the  SOM; 
thus,  4  outputs  provide  the  4  groups  of  the  clustered  sample  vectors. 


Figure  38  Configuration  of  the  SOM  network3 


2  http://www.mathworks.com/help/nnet/ug/cluster-with-self-organizing-map-neural-network.html 

3  http://www.mathworks.com/help/nnet/gs/cluster-data-with-a-self-organizing-map.html 
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In  the  training  session,  there  is  no  need  to  know  prior  the  classes  of  the  training  sample  vectors. 
The  SOM  will  discover  those  hidden  properties,  and  will  decide  on  the  grouping  of  the  sample 
vectors.  Thus,  the  inputs  of  the  network  are  the  number  of  the  groups  and  the  sample  vectors. 
After  the  SOM  clustered  the  waveforms  into  groups,  the  match  to  the  “real”  classes  should  be 
determined.  These  correspondences  are  done  by  comparing  the  suggestions  of  the  SOM  and  the 
“real”  classes;  for  this  reason,  knowing  the  prior  classes  of  the  input  vectors  is  necessary. 


The  SOM  will  provide  the  “weights”  for  the  60  input  samples.  These  weights  represent  the 
common  sample  vector  of  the  SOM’s  class.  In  the  validation  phase,  the  algorithm  measures 
distances  between  the  weights  and  the  validation  sample  vectors  to  decide  on  which  class  they 
belong  to.  The  closest  one  will  determine  this  class.  This  process  is  same  as  it  was  introduced  in 
the  “Measuring  distance  from  median  waveform”  section,  but  the  median  waveforms  are  the 
weight  vectors  in  this  case. 


4.4.2  Confusion  matrix 

The  concept  of  the  confusion  matrix  is  the  most  widely  used  framework  to  validate  the 
performance  of  classification  methods.  Here,  a  short  discussion  is  provided. 

Table  15  Confusion  matrix 


roadl 

road2 

grass 

tree 

building 

False 

negative 

15 

1 

11 

0 

0 

roadl 

55.6% 

2.0% 

21.2% 

0.0% 

0.0% 

44.4% 

0 

10 

0 

0 

0 

road2 

0.0% 

41.7% 

0.0% 

0.0% 

0.0% 

0.0% 

0 

0 

25 

2 

0 

grass 

0.0% 

0.0% 

65.8% 

4.8% 

0.0% 

7.4% 

0 

13 

0 

15 

0 

tree 

0.0% 

33.3% 

0.0% 

50.0% 

0.0% 

46.4% 

0 

0 

0 

0 

building 

0.0% 

0.0% 

0.0% 

0.0% 

0.0% 

False 

positive 

0.0% 

58.3% 

30.6% 

11.8% 

0.0% 

71.9% 
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Table  15  shows  an  example;  the  rows  represent  the  classes  of  the  waveforms,  are  the  columns 
are  the  classes  selected  by  the  classifier.  Thus,  the  bold  numbers  in  the  (ij)  cells  give  us  the 
number  of  those  waveforms  which  are  in  the  ith  class,  while  the  classifier  selected  the  jth  class.  If 
the  classifier  works  perfectly,  all  the  elements  except  the  diagonals  should  be  zero.  The 
percentages  under  the  numbers  in  the  table  are  calculated  by  the  following  expression: 


CU  = 


n 


ij 


Xk= i  nk,j  T  Xfc=i  ni,k  ni,j 


100  %, 


where  n tj  is  the  number  of  the  (ij)  cell  and  N  is  the  number  of  the  classes.  Thus,  the  percentage 
shows  the  ratio  of  the  total  matches  and  mismatches;  obviously,  it  is  independent  from  the 
number  of  the  points  within  the  classes  The  cells  are  color-coded;  darker  color  indicates  higher 
percentage.  If  the  same  shades  repeat  within  a  column,  it  means  that  the  classifier  cannot 
distinguish  the  classes  properly;  for  example,  see  the  column  of  road2.  The  ratios  of  false 
positives  and  false  negatives  are  found  in  the  last  rows  and  columns.  The  false  negative  shows 
the  ratio  of  the  mismatches,  when  the  class  indicated  by  the  column  was  selected,  but  the  class  of 
the  row  was  supposed  to  be  selected.  The  calculation  is  the  following: 


Ni  = 


n i 


Zk=ink,j 


*  100  % 


In  the  case  of  false  positive,  the  classifier  chooses  the  class  of  the  row,  but  it  was  supposed  to  be 
in  the  class  of  the  column: 


Pj  =  *  100  % 


Ek= i  rijk 


Finally,  the  total  classification  error  is  the  ratio  of  the  all  of  the  matches  against  the  all  samples: 


T  = 


*  100  %. 


The  evaluation  of  the  total  matches  depends  on  the  number  of  the  classes.  For  example,  if  two 
classes  are  examined,  the  50%  match  rate  is  same  as  randomly  choosing  a  class  (flip  the  coin). 
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But  if  the  number  of  classes  is  greater,  the  50%  can  be  evaluated  as  a  better  result  as  the  random 
selection.  The  total  ratio  of  a  random  classifier  is  -  *  100%. 


4.4.3  The  library  and  tools 
Tools 

A  MATLAB  library  was  developed  to  support  this  project,  including  basic  data  manipulation 
operations,  organizing  the  data,  implementing  algorithm,  and  to  evaluate  and  visualize  results. 
The  source  code  is  available  from  Google  Codes  using  SVN  connection 
(https://code.google.eom/p/lidar-wf-classification/.  2014).  All  the  tools  used  are  listed  in  Table  16 


Table  16  Software  and  3rd  party  libraries  (links  from  2014) 


LASTools 

LAS  file  operation,  some  functions  are  free,  others  are  limited, 
download  from  http://www.cs.unc.edu/~isenburg/lastools/ 

Riegl’s  Waveform  extraction 
library 

Riegl’s  library  for  extracting  data  from  SDF  file 

Matlab 

General  mathematical  framework 

FugroViewer 

Free  LAS  file  viewer  from  Fugro  Ltd.,  download  from 
http://www.fugroviewer.com/reauest/default.asp 

QGIS 

Free  and  open  source  GIS  desktop  application,  download  from 
http://www.agis.org/en/site/ 

4.5  Classification  evaluation 
4.5.1  Incidence  angle  estimation 
Data  preparation 

The  estimation  of  the  incidence  angle  from  waveform  shape  deformation  was  investigated  based 
on  using  targets  in  Chapter  3.  Here  the  focus  was  on  using  typical  airborne  data,  in  which  case  no 
ground  control  is  available  in  general.  Therefore,  objects  that  can  be  easily  described  by 
geometrical  primitives  should  be  considered  in  order  to  obtain  some  local  relative  reference 
necessary  for  the  evaluation.  For  this  reason,  a  point  cloud  from  a  roof  was  selected  from  the 
Corbin  data  set,  see  Figure  39.  The  scan  angle  of  these  points  covers  11-12°.  Note  that  the  point 
cloud  was  divided  into  two  classes  (data  sets),  depicted  by  green  and  red.  Both  point  sets  define  a 
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plane,  and  the  angle  between  these  planes  is  around  60°.  As  the  normal  vectors  of  the  planes  are 
notably  different,  the  material  of  the  backscattering  object  (target)  is  the  same,  and  the  scan  angle 
is  also  nearly  identical,  the  comparison  of  the  waveforms  between  these  two  classes  is  expected 
to  show  the  difference  caused  by  the  incidence  angle. 


Figure  39  Roof  with  two  classes  (left)  and  aerial  photo  of  the  building  (right) 


In  order  to  make  sure  that  all  the  points  belong  to  the  roof,  a  fitting  plane  was  estimated  for  both 
sides,  and  points  with  a  distance  from  the  plane  more  than  3 -times  of  the  standard  deviation  of  all 
distances  (3-sigma  rule)  were  removed  from  the  point  sets.  The  outputs  of  this  calculation  with 
the  results  are  shown  in  Figure  40  and  Table  17. 
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Figure  40  Plan  fitting  and  outlier  removal 


Table  17  Statitics  of  point  removal 


Class  1 

Class  2 

Number  of  points 

172 

158 

Standard  deviation  (STD)  [m] 

0.1 

0.3 

Number  of  removed  points 

7 

4 

STD  after  removal  [m] 

0.04 

0.03 

Gaussian  parameter  estimation 

The  generalized  Gaussian  parameters  were  calculated  for  both  classes  based,  as  described  in 
section  4.3.2.  The  averages  and  standard  deviations  of  the  parameters  and  the  histograms  for  all 
parameters  are  shown  in  Figure  41  and  Table  18.  Note  that  the  parameters  are  quite  similar  for 
both  classes;  the  order-of-magnitude  of  the  differences  are  two  times  less  than  the  standard 
deviation.  Therefore,  the  classification  based  only  on  these  parameters  is  difficult. 
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Figure  41  Histograms  of  the  six  Gaussian  parameters 
Table  18  Statistics  of  the  Gaussian  parameters 


Pi 

P2 

Ps 

P4 

Ps 

Class  1 

AVG 

133.131 

17.367 

2.637 

3.858 

2.001 

STD 

7.933 

1.210 

0.100 

0.279 

0.000 

Class  2 

AVG 

134.336 

17.376 

2.623 

3.935 

2.001 

STD 

10.764 

1.204 

0.064 

0.914 

0.001 

Unfortunately,  calculating  the  skewness  and  the  kurtosis  parameters  resulted  in  a  similar 
conclusion,  i.e.,  the  average  and  the  median  values  are  very  close  to  each  other  for  the  two 
classes,  and  the  standard  deviation  is  much  higher,  see  Table  19  and  Figure  42. 


Table  19  Skewness  and  Kurtosis  results 


Skewness 

Kurtosis 

AVG 

Median 

AVG 

Median 

Class  1 

2.291 

2.292 

6.909 

6.911 

Class  2 

2.295 

2.296 

6.927 

6.939 
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Figure  42  Histograms  of  skewness  and  kurtosis;  mean  and  median  are  marked  be  red  and  green,  respectively 


Although  the  first  look  of  the  Gaussian,  skewness  and  kurtosis  parameters  show  no  desired 
differences,  a  routine  discriminant  analysis  was  performed  on  these  parameters.  The  results  as 
confusion  matrix  are  shown  in  Figure  43.  Note  that  the  success  ratio  is  53.9%,  which  is  about 
same  as  just  randomly  selecting  a  class  (50%). 
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Figure  43  Confusion  matrix  from  the  discriminant  analysis 
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Median  waveforms 


The  above  presented  results  clearly  indicate  that  the  Gaussian  parameters  cannot  be  used  for 
describing  the  changes  on  waveforms  due  to  different  incidence  angles.  Therefore,  the  next 
logical  step  is  to  use  the  entire  waveform  for  classification,  as  opposed  to  the  six  parameters. 

First,  the  average  and  median  waveforms  are  analyzed  for  the  classes;  Figure  44  shows  these 
functions,  marked  by  red  and  blue  lines.  The  ranges  of  the  standard  deviation  at  the  samples  are 
depicted  with  dashed  lines  and  then  the  black  lines  indicate  the  minimum  and  maximum  values. 
Note  that  spline  interpolation  is  used  on  the  data. 


Classl 


Class2 


Figure  44  Typical  waveforms  with  spline  interpolation  used  for  visualization 

Figure  45  shows  the  median  waveforms  between  the  10  and  40  samples,  dashed  lines  depict  the 
range  of  the  standard  deviation.  Note  that  the  upper  and  lower  bounds  are  not  parallel  because  of 
the  spline  interpolation.  Analyzing  the  figure,  a  little  difference  can  be  noticed  between  the  two 
classes  around  the  maximum  peak,  but,  unfortunately,  the  standard  deviation  is  much  higher  than 
this  difference,  consequently,  it  is  statistically  not  relevant. 
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Median  Waveforms 


Figure  45  Median  waveforms  for  the  two  classes 

Neural  networks 

As  it  was  presented  above  the  Gaussian  parameters  and  the  median  waveforms  could  not  provide 
acceptable  results.  Therefore,  as  the  last  attempt,  pattern  recognition  neural  networks  with  the 
translated  sample  vectors  as  inputs  were  tested;  the  neural  network  may  provide  better  result  due 
to  its  capability  to  handle  nonlinear  classification  problems.  The  85%  of  the  data  formed  the 
training  set  and  15%  of  the  waveforms  were  used  for  validation.  The  results,  including 
visualization  are  shown  in  Figure  46.  The  Figures  46  (a)-(c)  show  the  confusion  matrix  on  the 
overall,  the  training  and  the  validation  set.  As  we  can  see,  the  total  score  is  -58%  which  is  the 
best  result  comparing  the  other  classification  methods,  but  it  is  still  just  slightly  better  than 
choosing  the  classes  by  coin  flipping  (50%). 
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(a)  Confusion  matrix  of  overall  data 


(b)  Confusion  matrix  of  training  set 
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(c)  Confusion  matrix  of  validation  set 


(d)  Translated  sample  vectors  (waveforms) 
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(e)  Histogram  of  the  Neural  network 


(f)  Classified  points  in  the  3D  space 


Figure  46  Neural  network  based  classification  performance 


Discussion 

Based  on  the  test  data  set,  the  classification  of  the  Gaussian  parameters,  extended  with  skewness 
and  kurtosis,  showed  that  the  classification  performance  is  at  the  same  level  as  the  error  by 
choosing  a  class  with  flipping  coins  (50%).  It  indicates  that  these  parameters  are  unable  to 
predict  the  incidence  angle.  Using  the  sample  vectors,  a  slightly  better  performance  can  be 
achieved,  about  ~60%,  but  it  is  still  not  an  acceptable  rate.  Overall,  the  roof  tests  confirmed  that 
no  reliable  results  on  the  impact  of  incidence  angle  on  the  waveform  can  be  obtained. 

4.5.2  Clustering  with  SOM  neutral  network 

After  the  negative  outcome  of  the  roof  tests,  a  SOM  neural  network  to  detect  any  features  in  a 
larger  dataset  was  used.  Figure  47  shows  this  selected  dataset,  a  120  m  long  paved  road.  The 
surface  of  the  road  is  nearly  planar,  the  changes  in  elevation  is  less  than  0.5  m.  The  scan 
direction  was  perpendicular  to  the  road,  and  the  scan  angles  vary  between  11°  and  21°.  Here  the 
objective  is  similar:  (1)  detecting  the  scan  angle  from  waveform,  and  (2)  the  impact  of  the 
topography  on  the  performance, 
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Figure  47  Road  test  area;  orthophoto  (a),  altitude  (b),  and  scan  angle  (c) 


Using  sample  vectors 

In  our  first  approach,  the  original  length  sample  vectors  were  used  without  any  translation.  The 
neural  network  was  the  same  as  it  was  introduced  earlier.  The  neuron  configuration  was  2x2, 
which  implies  that  4  groups  are  determined  by  the  SOM.  The  detected  classes  are  shown  in  a 
local  coordinate  system  in  Figure  48. 
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(b)  Original  sample  vectors  by  groups  (c)  Weights 


Figure  48  SOM  result  from  original  waveforms 


In  Figure  48,  the  lines  between  the  15  and  -5  values  of  the  Y  axis  show  the  class  densities,  the  X 
axis  is  split  to  equidistance  sections,  and  the  number  of  the  points  within  the  classes  is 
accumulated  like  a  histogram.  These  lines  indicate  that  the  densities  of  the  different  points  in  the 
direction  of  X  axis  are  uniform. 
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In  Figure  48,  the  pavement  markings  on  the  road  (Groups  3  and  4)  and  the  road  body  (Groups  1 
and  2)  are  easily  distinguishable.  In  addition,  a  pattern  is  also  noticeable  in  the  data,  see  the 
waveforms  regarding  the  groups  in  In  Figure  48b  and  the  weights  provided  by  the  SOM  in 
Figure  48c.  Note  that  the  location  of  the  maximum  peaks  of  Groups  1  and  3  are  different  than  in 
the  case  of  Groups  2  and  4.  We  assume  that  these  differences  are  caused  by  the  measuring  or  the 
processing  components  of  the  LiDAR  system,  and  it  does  not  indicate  any  impact  of  incidence 
angle  or  other  phenomenon. 


Using  translated  sample  vectors 

In  the  next  step,  the  translated  sample  vectors  were  applied  to  the  same  SOM  network.  Figure  49 
shows  the  results  of  this  test.  Note  that  the  patterns  are  disappeared,  as  expected.  The  Group  4  is 
clearly  representing  the  pavement  markings.  Group  2  may  be  able  to  detect  the  topographic 
changes,  compare  this  result  to  Figure  48b.  The  Group  3  and  Group  4  may  be  interpreted  as  the 
detection  of  the  changes  of  the  scan.  The  density  line  of  Group  3  shows  that  the  numbers  of  the 
Group  3  points  are  higher  at  the  beginning  and  it  is  decreasing,  while  the  Group  1  point  density 
is  lower  at  the  beginning  and  higher  in  the  end.  Note  that  the  changes  of  the  scan  angles  are  in 
the  X  direction,  see  Figure  47c,  and  thus  the  Groups  3  and  4  may  predict  the  impact  of  the 
incidence  angle. 
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Figure  49  SOM  result  from  translated  waveforms 


4.5.3  Land  classification 

The  aim  of  this  examination  is  to  classify  the  waveforms  by  the  backscattering  object  type.  In 
this  study,  the  five  types  of  land  categories  are  investigated.  Figure  50  shows  these  categories: 
roadl  (paved  road),  road2  (mud  road),  grass,  building  and  tree. 

The  point  clouds  of  these  categories  are  selected  by  the  area,  as  it  is  shown  in  Figure  50.  Each 
point  cloud  needs  to  be  corrected  with  removing  those  points  that  correspond  to  other  category. 
Typically,  the  tree  canopy  can  overlap  to  another  land  type  area,  and  these  points  are  removed.  In 
addition,  in  the  tree  category,  the  grass  is  also  removed.  Note,  in  this  study,  the  multiple 
detections  are  not  used;  they  are  known  to  be  able  to  predict  the  laser  beam  backscattering  from 
trees.  Thus  points  with  than  one  number  of  returns  are  eliminated.  The  return  number  is 
extrcated  from  the  LAS  file.  After  removing  these  points,  the  laser  beams  in  the  tree  category  are 
supposed  to  be  backscattered  from  the  canopies. 

The  corrected  point  clouds  can  be  seen  in  the  Figure  51.  These  point  sets  are  the  classifier  inputs 
and  the  categories  are  the  classes.  The  aim  of  the  investigation  is  to  find  the  best  classifier  for 
predicting  the  class  from  the  inputs. 
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Figure  50  Land  type  categories  (classes) 


71 


Exploitation  of  Full- waveform  LiDAR  to  Characterize/Exploit  under  Canopy  Targets  -  Foliage  Penetration  (FOPEN) 


4.2305 

4.2305 

4.2305 

4.2305 

4.2305 

E, 

4.2305 

4.2305 

4.2305 

4.2305 

4.2305 

4.2304 


Figure  51  Land  type  classes  after  filtering 

Median  waveforms 

First,  the  sample  vectors  of  the  classes  were  examined.  The  median  waveforms  from  the  60 
sample  input  for  each  class  is  shown  in  Figure  52.  Note  that  the  maximum  intensity  can  be  used 
for  distinguishing  some  classes  and  that  the  standard  deviation  at  each  pulse  is  also  large  making 
the  classification  a  challenge. 

To  compare  the  median  sample  vectors,  see  in  the  left  part  of  Figure  53.  The  range  of  the  median 
absolute  deviation  is  depicted  by  the  dashed  lines.  Clearly,  the  maximum  intensity  helps  separate 
classes  road2  and  grass  from  classes  roadl,  tree  and  building.  Also  note  that  the  tree  median 
sample  also  has  a  different  tail  shape  than  roadl,  road2  and  the  building.  Unfortunately,  this  tail 
shape  is  similar  to  that  of  the  roadl,  though  the  amplitude  is  lower.  The  two  class  regions  of  the 
median  deviation  are  overlapping  with  each  other,  which  can  cause  problems  in  the  class 
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separation.  Also  note  that  the  shape  Gaussian  parameter  (p5)  is  different  in  the  roadl  class  than 


in  any  other  classes. 
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Figure  52  Statistics  of  sample  vectors  per  class 


The  classification  bases  on  these  median  waveforms  are  shown  by  the  right  part  of  Figure  53. 
Note  that  there  is  pattern  not  representing  any  categories.  The  same  pattern  was  seen  in  the  4.5.2, 
Clustering  with  SOM  neural  network,  section.  Both  observations  confirm  the  use  of  the 
translated  sample  vectors  instead  of  original  sample  vectors. 

The  median  sample  vector  from  the  translated  sample  vectors  can  be  seen  in  the  Figure  54.  Note 
that  the  shapes  of  the  waveforms  are  nearly  the  same,  suggesting  that  the  maximum  peak 
location  difference  is  caused  by  the  different  shapes  of  median  waveform  from  the  original 
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sample  vectors  (presented  in  Figure  53).  Also  note  that  the  distinguishable  tail  shape  of 


vectors  from  the  tree  class  is  still  present. 
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Figure  53  Median  sample  vectors  from  translated  sample  vectors  samples  (left),  and  the  classification  results  (right) 


Figure  54  Median  sample  vectors  from  translated  sample  vectors 


74 


Exploitation  of  Full- waveform  LiDAR  to  Characterize/Exploit  under  Canopy  Targets  -  Foliage  Penetration  (FOPEN) 


In  the  Figure  54,  it  is  seen  that  the  median  standard  deviation  is  still  large,  and  they  are 
overlapping  each  other.  In  order  to  determine  how  these  median  vectors  describe  the  classes,  the 
classification  by  these  common  waveforms  is  done.  The  distance  between  the  samples  and  the 
median  waveforms  are  measured  by  the  maximum  norm,  because  it  is  found  to  be  the  best 
distance  definition  in  these  circumstances.  In  order  to  increase  the  classification  reliability,  the 
approach  presented  in  4.4.1  section  was  extended:  the  minimal  threshold  is  applied  to  accept  if 
the  sample  belongs  to  the  selected  class.  This  threshold  requires  a  minimal  distance  between  the 
sample  and  the  class  to  accept  the  class.  The  threshold  is  the  N  times  the  standard  deviation  of 
the  distances.  Those  samples  of  which  distances  are  larger  than  the  threshold  will  not  be 
classified. 


The  overall  classification  errors  and  the  ratio  of  the  positive  and  negative  mismatches  can  be 
seen  in  Figure  55.  On  the  X  axis  the  N  value  shows  the  threshold  and  Y  axis  is  the  matching  and 
mismatching  ratios.  Also  shown  is  the  number  of  the  remaining  sample  numbers  (i.e.  data  ratio), 
because  increasing  the  threshold  causes  to  decrease  the  number  of  those  points  that  are  able  to  be 
examined. 


—  Data  ratio 
^—Overall  matches 
---roadl  false  positive 
--  road2  false  positive 
grass  false  positive 
tree  false  positive 
building  false  positive 

♦  roadl  false  negative 

♦  road2  false  negative 

♦  grass  false  negative 
tree  false  negative 
building  false  negative 


Figure  55  Distances  from  median  waveform 
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Figure  55  shows  that  the  ratio  of  the  overall  matches  is  53%  and  it  is  increasing  with  decreasing 
the  threshold. 


The  confusion  matrices  at  no  threshold  (i.e.  infinity  threshold),  0.5  and  0.2  sigma  are  shon  in  the 
Table  20,  Table  21,  and  Table  22,  respectively.  Note  that  without  any  threshold,  the 
classification  error  is  54%.  Not  that  it  is  already  better  than  select  the  class  randomly  (20%).  This 
error  can  achieve  71.3%,  but  note  that  practically  the  classifier  only  works  for  the  roadl,  grass 
and  tree  classes.  The  results  at  other  thresholds  also  indicate  that  this  classifier  only  works 
correctly  for  these  classes. 


Table  20  Classification  by  the  distance  from  the  median  waveform  without  threshold 


roadl 

road2 

grass 

tree 

building 

False 

negative 

2220 

269 

45 

625 

150 

roadl 

32.9% 

43.1% 

2.3% 

0.8% 

9.9% 

3.6% 

128 

5126 

565 

351 

116 

road2 

18.5% 

1.3% 

51.8% 

7.2% 

3.7% 

1.6% 

41 

2525 

915 

178 

20 

grass 

75.1% 

0.5% 

25.5% 

18.9% 

2.5% 

0.4% 

761 

363 

372 

1927 

371 

tree 

49.2% 

10.7% 

3.0% 

6.8% 

35.2% 

8.4% 

916 

458 

192 

519 

343 

building 

85.9% 

16.4% 

4.3% 

4.4% 

9.4% 

11.1% 

45.4% 

41.4% 

56.2% 

46.5% 

65.7% 

54.0% 
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Table  21  Classification  by  the  distance  from  the  median  waveform  with  at  0.5  sigma  threshold 


roadl 

road2 

grass 

tree 

building 

False 

negative 

1124 

18 

5 

198 

101 

roadl 

22.3% 

68.9% 

0.7% 

0.3% 

10.8% 

6.3% 

9 

591 

102 

21 

22 

road2 

20.7% 

0.4% 

46.3% 

10.7% 

1.6% 

2.3% 

6 

427 

129 

4 

5 

grass 

77.4% 

0.3% 

33.7% 

17.1% 

0.3% 

0.6% 

96 

54 

54 

328 

90 

tree 

47.3% 

5.2% 

3.2% 

6.1% 

37.5% 

11.6% 

74 

32 

22 

30 

29 

building 

84.5% 

5.2% 

2.5% 

4.6% 

4.1% 

7.2% 

14.1% 

47.3% 

58.7% 

43.5% 

88.3% 

61.6% 

Table  22  Classification  by  the  distance  from  the  median  waveform  with  0.2  sigma  threshold 


roadl 

road2 

grass 

tree 

building 

False 

negative 

0 

0 

0 

4 

roadl 

3.1% 

0.0% 

0.0% 

0.0% 

3.0% 

IfJSSSaKS  1 

39 

3 

0 

0 

road2 

9.3% 

0.6% 

40.6% 

5.5% 

0.0% 

0.0% 

ilSKOni  i 

47 

9 

0 

1 

grass 

84.5% 

0.5% 

45.6% 

14.1% 

0.0% 

1.5% 

MSI!  6 

6 

1 

19 

1 

tree 

42.4% 

HEffg5jg«  3.6% 

5.0% 

2.1% 

57.6% 

2.5% 

5 

0 

2 

0 

2 

building 

77.8% 

3-5% 

0.0% 

9.1% 

0.0% 

13.3% 

mmm  9.4% 

57.6% 

40.0% 

0.0% 

75.0% 

71.3% 
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In  the  next  step,  the  extended  Gaussian  parameters  and  the  kurtosis-skewness  pairs  were  also 
examined.  The  histograms  of  the  Gaussian  parameters  by  the  classes  are  shown  in  Figure  56.  The 
histograms  of  the  kurtosis  and  skewness  values  are  in  Figure  57. 
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Figure  56  Histograms  of  the  Gaussian  parameters  by  the  categories 


The  linear  discriminant  analysis  (LDA)  was  applied  on  these  parameters  (Gaussian  +  kurtosis  + 
skewness).  The  confusion  matrix  of  the  result  is  shown  in  Table  23.  The  total  classification  error 
(60%)  provides  slightly  better  results  than  the  median  waveform  approach  when  no  threshold  is 
applied.  Also  this  method  can  only  distinguish  the  roadl,  road2  and  tree  classes;  it  cannot  handle 
the  classification  of  grass  and  buildings.  The  Gaussian  parameters  predict  that  the  grass  and  the 
dirt  road  (road2)  are  same. 
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Figure  57  Kurtosis  and  skewness 


Table  23  Linear  discriminant  analysis  of  the  Gaussian  parameters 


roadl 

road2 

grass 

tree 

building 

False 

negative 

2864 

367 

0 

12 

66 

roadl 

13.4% 

64.9% 

2.5% 

0.0% 

0.2% 

1.5% 

131 

5956 

0 

57 

142 

road2 

5.2% 

1.3% 

50.1% 

0.0% 

0.6% 

2.0% 

17 

3602 

0 

45 

15 

grass 

100.0% 

0.2% 

30.9% 

0.0% 

0.7% 

0.3% 

216 

606 

0 

2624 

348 

tree 

30.8% 

2.9% 

4.1% 

0.0% 

64.4% 

7.7% 

738 

1031 

0 

167 

492 

building 

79.7% 

13.0% 

8.0% 

0.0% 

3.2% 

16.4% 

27.8% 

48.5% 

0% 

9.7% 

53.7% 

61.2% 
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SOM 


Clustering  the  translated  sample  vectors  provides  better  separation  between  grass  and  roadl,  and 
to  select  buildings.  Two  SOM  configurations  were  tested.  The  first  is  3x3,  which  creates  9 
groups,  and  second  is  2x2,  which  creates  4  groups.  The  results  can  be  seen  in 
Figure  58. 


Group  # 

roadl 

road2 

grass 

tree 

building 

1 

8.4% 

0.7% 

1.1% 

17.7% 

72.0% 

2 

79.1% 

2.3% 

0.4% 

13.1% 

5.1% 

3 

4.6% 

17.5% 

5.2% 

38.4% 

34.4% 

4 

0.2% 

1.6% 

0.8% 

93.5% 

3.9% 

5 

1.2% 

44.0% 

44.7% 

7.0% 

3.1% 

6 

5.6% 

68.9% 

19.1% 

0.7% 

5.8% 

7 

4.9% 

14.8% 

4.1% 

45.6% 

30.6% 

8 

1.6% 

41.6% 

44.7% 

8.5% 

3.6% 

9 

4.3% 

70.3% 

18.8% 

0.8% 

5.8% 

♦  Group  1 


Group  # 

roadl 

road2 

grass 

tree 

building 

1 

64.2% 

1.4% 

0.5% 

18.0% 

15.9% 

2 

4.7% 

11.7% 

1.8% 

55.1% 

26.7% 

3 

1.9% 

38.8% 

43.5% 

10.4% 

5.4% 

4 

4.5% 

69.4% 

19.7% 

0.8% 

5.7% 
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Figure  58  Land  classification  by  SOM  with  4x4  neuron  configuration  (left)  and  2x2  neuron  configuration  (right) 


The  result  matrix  of  the  4x4  SOM  shows  that  Group  2  represents  roadl,  Group  1  represents 
building,  Groups  6  and  9  are  road2.  The  estimation  of  the  grass  is  available  from  Group  5  and 
Group  8.  The  misclassification  of  the  grass  can  be  decreased  using  the  Groups  6  and  9,  which 
indicate  that  the  samples  are  from  road2.  The  shape  of  the  common  sample  waveforms  (the 
weight  vectors)  shows  that  the  shape  of  Group  8  is  similar  to  that  of  the  median  waveforms  with 
original  sample  vectors. 


The  4  groups  produced  by  2x2  SOM  also  shows  better  classification  performance  between  road2 
and  grass  than  the  previous  methods,  but  the  results  of  other  classes  are  worse.  Overall,  the  SOM 
analysis  claims  that  the  main  features  of  the  waveforms  are  the  value  of  the  maximum  intensity. 


Combined  method 


The  above  presented  three  methods  (median  waveform  distances,  discriminant  analysis  and 
SOM)  are  combined  to  achieve  the  best  performance.  First,  the  algorithm  uses  Bayesian  decision 
to  classify  the  sample  vectors.  The  Bayesian  decision  is  based  on  the  confusion  matrix  provided 
by  the  median  waveform  distance  classifier  and  the  discriminant  analysis  of  the  Gaussian 
parameters.  After  that,  those  points,  which  are  selected  as  road2,  will  be  classified  again  by  the 
2x2  SOM.  In  this  step,  the  algorithm  distinguishes  road2  and  grass.  The  last  step  will  split  the 
previously  determined  roadl  class  to  roadl  and  building  classes.  The  steps  of  the  algorithm  are 
presented  by  Table  24. 


Table  24  Combined  method 


Method 

From 

To 

1 

Bayesian  decision  using  the  confusion 
matrices  of  median  waveforms  and 
discriminant  analysis 

all  vectors 

roadl, road2, 
grass,  tree, 
building 

2 

Select  grass  points  from  road2  with  2x2  SOM 

road2 

road2,  grass 

3 

Select  building  from  roadl  with  4x4  SOM 

roadl 

roadl,  building 

The  dataset  has  been  divided  into  train  and  validation  set.  The  training  set  included  the  70%  of 
the  total  dataset  and  the  validation  set  contained  the  remaining  30%.  The  confusion  matrices  that 
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are  used  for  the  Bayesian  decision  and  the  SOM  networks  are  calculated  from  the  training  set. 
The  solution  is  tested  on  both  sets.  In  order  to  improve  the  results,  we  applied  a  mode  filter  on 
the  dataset.  It  corrects  the  class  of  single  waveforms  by  the  mode  of  the  waveforms  found  within 
3  m.  Note  that  it  already  uses  the  location  of  the  waveforms  (i.e.  3D  point),  not  just  the  shape, 
thus,  it  is  an  improved  solution. 


The  results,  shown  in 


Figure  59  and  Figure  60  shows  65%  classification  error  on  the  train  and  the  validation  set.  After 
using  mode  filter,  the  performance  can  reach  73-75%.  Also  note  that  the  diagonal  elements  of  the 
confusion  matrix  contain  the  majority  of  the  points  within  the  classes,  which  means  that  the 
correct  matches  are  dominant  in  each  class. 
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Figure  59  Results  on  the  training  set  before  applying  mode  filter  (top)  and  after  if  (bottom) 
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5.  LiDAR  SYSTEM  PERFORMANCE  COMPARISONS 


As  LiDAR  technology  continues  to  develop,  new  approaches,  such  as  CW  FM  LiDAR 
(Continuous  Waveform  Frequency  Modulated),  are  introduced  and  improved  hardware  solutions 
are  used  in  the  next  generation  of  LiDAR  systems.  Research  started  to  investigate  the  impact  of 
the  new  solutions  on  waveform  formation  and  analysis.  The  lack  of  funding  prevented  to 
completion  of  this  task,  but  two  areas  have  been  partially  pursued  under  other  projects: 

•  The  CW  FM  system  from  Bridger  Photonics  is  of  interest  because  of  the  expectation  that 
this  approach  may  provide  better  penetration  capability  for  certain  materials,  such 
modestly  transparent  plastic,  glass,  and  other  materials.  In  addition,  this  sensing  approach 
provides  Doppler  observations  that  could  be  beneficial  to  separate  static  and  moving 
objects  in  the  image  area. 

•  LiDAR  systems  are  performance  validated  using  ground  control,  mainly  certain  materials 
arranged  in  simple  geometrical  shapes;  for  example,  circular  targets  with  material 
characterization.  Since  it  is  difficult  to  provide  identical  object  space  environmental 
conditions  to  repeat  tests,  variation  system  performance  is  rarely  validated.  Therefore,  it 
was  of  great  interest  to  analyze  LiDAR  data  simultaneously  acquired  by  two  differ 
LiDAR  systems. 

A  peer-reviewed  conference  paper  and  a  presentation  on  Bridger  Photonics  CW  FM  system, 
HRS-3D-1W,  providing  a  comparison  to  other  sensors,  and  a  journal  paper  using  the  two 
simultaneously  acquired  LiDAR  datasets,  focused  on  the  compression  aspect  of  LiDAR 
waveform,  are  included  in  the  Appendix. 
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6.  CONCLUSION  AND  FUTURE  TASKS 


In  these  investigation  data  acquired  by  a  Riegl  VZ400  digitizer  with  2  ns  sampling  rate  was  used. 
Our  theoretical  investigation  shows  that  in  the  300-600  m  instrument-to-target  range,  the  impact 
of  the  incidence  angle  is  in  the  -0.6-1. 2  ns  range  that  is  not  easily  to  detectable  with  this 
somewhat  modest  sampling  rate.  Note  that  airborne  scanners  typically  work  with  1  ns  sampling 
rate  (or  sometime  with  0.5  ns),  in  which  case,  the  incidence  angle  can  be  reasonable  estimated. 

Despite  the  less  than  ideal  sampling  rate  of  the  waveform,  the  investigation  results  suggest  that 
the  incidence  angle  can  be  determined  with  an  approximately  10°  standard  deviation. 

For  material  identification/detection,  the  reflectance  properties  of  the  target  object  were 
examined.  In  ScanPos2  and  ScanPos3  datasets,  a  small  variety  of  materials  was  represented. 
Using  a  simple  method  based  on  the  distance  between  an  incoming  waveform  and  the  average 
waveform,  2-3  reflectance  classes  could  be  easily  differentiated,  with  classification  preformance 
of  93. 1%  reliability. 

The  based  on  the  experiences  so  far,  the  validity  of  the  algorithms  has  been  checked  and  modest 
results  have  been  achieved,  which  is  due  to  the  modest  sampling  rate  of  the  waveform.  Note  that 
the  footprint  size  is  also  part  of  the  problem,  as  the  impact  of  incidence  angle  as  well  as  material 
properties  depends  on  the  divergence  of  the  laser  beam.  For  better  testing  the  methods,  additional 
measurements  are  required,  and  arrangement  is  in  place  to  acquire  airborne  waveform  data 
simultaneously  collected  by  three  scanners. 

With  the  arrival  of  two  new  data  sets,  acquired  by  two  different  airborne  scanners,  the 
investigation  continued  on  assessing  the  classification  performance  of  waveform  exploitation. 
The  current  results  indicate  that  the  conditions  of  normal  airborne  data  acquisitions  is  harsher 
compared  to  the  test  environment,  and,  consequently,  not  all  the  results  can  be  achieved  at  the 
level  of  the  controlled  environment  tests.  Increasing  the  algorithmic  complexity,  however,  the 
classification  results  can  improve. 
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ABSTRACT: 

Topographic  Light  Detection  and  Ranging  (LiDAR)  technology  has  advanced  greatly  in  the  past  decade.  Pulse  repetition  rates  of 
terrestrial  and  airborne  systems  have  multiplied  thus  vastly  increasing  data  acquisition  rates.  Geiger-mode  and  FLASH  LiDAR  have  also 
become  far  more  mature  technologies.  However,  a  new  and  relatively  unknown  technology  is  maturing  rapidly:  Frequency-Modulated 
Continuous  Wave  Laser  Detection  and  Ranging  (FMCW-LADAR).  Possessing  attributes  more  akin  to  modern  radar  systems,  FMCW- 
LADAR  has  the  ability  to  more  finely  resolve  objects  separated  by  very  small  ranges.  For  tactical  military  applications  (as  described 
here),  this  can  be  a  real  advantage  over  single  frequency,  direct-detect  systems.  In  fact,  FMCW-LADAR  can  range  resolve  objects 
at  10-7  to  10-6  meter  scales.  FMCW-LADAR  can  also  detect  objects  at  greater  range  with  less  power.  In  this  study,  a  FMCW- 
LADAR  instrument  and  traditional  LiDAR  instrument  are  compared.  The  co-located  terrestrial  scanning  instruments  were  set  up  to 
perform  simultaneous  3-D  measurements  of  the  given  scene.  Several  targets  were  placed  in  the  scene  to  expose  the  difference  in  the 
range  resolution  capabilities  of  the  two  instruments.  The  scans  were  performed  at  or  nearly  the  same  horizontal  and  vertical  angular 
resolutions.  It  is  demonstrated  that  the  FMCW-LADAR  surpasses  the  perfomance  of  the  linear  mode  LiDAR  scanner  in  terms  of  range 
resolution.  Some  results  showing  the  maximum  range  acquisition  are  discussed  but  this  was  not  studied  in  detail  as  the  scanners’  laser 
powers  differed  by  a  small  amount.  Applications  and  implications  of  this  technology  are  also  discussed. 


1.  INTRODUCTION 

1.1  Background 

Pulsed  light  detection  and  ranging  (LiDAR)  systems  for  terres¬ 
trial  surveying  and  topographic  mapping  have  made  great  strides 
in  the  past  10  to  15  years.  Although  terrestrial  laser  scanning 
was  first  used  in  the  1960’s(Shan  and  Toth,  2009),  only  recently 
have  the  scanning  mechanisms  and  computer  hardware  advanced 
enough  to  support  topographic  measurements  within  a  reason¬ 
able  amount  of  time.  Terrestrial  LiDAR  instruments  are  now 
able  to  collect  millions  of  accurately  georegistered  points  within 
several  seconds.  While  pulsed  LiDAR  technology  is  certainly 
more  mature  than  other  approaches,  frequency-modulated  con¬ 
tinuous  wave  (FMCW)  laser  detection  and  ranging  (LADAR)  is 
rapidly  gaining  ground  due  to  several  key  benefits.  First  and 
most  evident,  is  the  difference  in  range  resolution  between  the 
two  technologies.  Pulsed  LiDAR  is  limited  in  range  resolution 
by  the  width  of  the  emitted  laser  pulse.  A  typical  conventional 
LiDAR  system  emits  Gaussian- shaped  laser  pulses.  Meanwhile, 
FMCW-LADAR  is  limited  in  range  resolution  by  the  chirped 
bandwidth  of  the  emitted  beam(Reibel  et  al.,  2014).  An  exam¬ 
ple  of  a  chirped  bandwidth  from  a  FMCW-LADAR  system  can 
be  seen  in  Reibel  et  al(Reibel  et  al.,  2010),  Figure  2.  Secondly, 
FMCW-LADAR  can  detect  doppler  motions  in  the  returned  laser 
pulse.  Finally,  FMCW-LADAR  requires  less  power  to  achieve 
range  measurements  or,  conversely,  can  detect  objects  at  a  greater 
range  than  a  pulsed  LiDAR  having  the  same  power.  In  this  study, 
the  range  resolutions  of  a  conventional,  pulsed  LiDAR  system 
and  a  FMCW-LADAR  system  are  compared  and  contrasted  us¬ 
ing  real-world  targets. 


*  Corresponding  author. 


1.2  Instrument  specifications 

The  instruments  used  in  this  comparison  experiment  were  a  Riegl 
VZ-400  3D  terrestrial  laser  scanner  and  a  Bridger  HRS-3D-1W 
imager.  The  VZ-400  is  a  conventional,  pulsed  Class  I  scanning 
laser  system  with  a  360°  horizontal  and  100°  vertical  field-of- 
view  (FOV).  The  HRS-3D  is  a  FMCW  Class  Illb  scanning  laser 
system  with  a  360°  horizontal  and  60°  vertical  FOV.  The  phys¬ 
ical  dimensions  of  the  scanner  heads  are  very  similar  (see  Table 
1).  The  HRS-3D  has  a  separate  processing  unit  whereas  the  VZ- 
400  performs  its  processing  within  the  scanner  unit.  The  HRS-3D 
weighs  about  twice  as  much  as  the  VZ-400  and  requires  about  3 
times  as  much  power.  Both  scanners  are  easily  tripod  mounted. 

The  VZ-400  and  HRS-3D  lasers  both  emit  at  1.55  gm.  The  max¬ 
imum  pulse  repetition  rates  for  the  VZ-400  and  HRS-3D  are  300 
kHz  and  48  kHz,  respectively.  The  beam  divergences  for  the  VZ- 
400  and  HRS-3D  are  0.35  mrad  and  0.1  mrad,  respectively. 


Riegl  VZ-400 

Bridger  HRS-3D 

Scanner  head 
(h  x  dia) 

3 1  cm  x  1 8  cm 

28  cm  x  20  cm 

Processor  size 

None 

48  cm  x  43  cm  x  23  cm 

Weight 

9.6  kg 

19  kg 

Power 

65  W 

350  W 

Table  1 :  Size,  weight,  and  power  comparisons  of  the  two  scanners 


This  contribution  has  been  peer-reviewed, 
doi:  1 0.51 94/isprsarchives-XL-1  -233-201 4 


233 


The  International  Archives  of  the  Photogrammetry,  Remote  Sensing  and  Spatial  Information  Sciences,  Volume  XL-1 , 2014 
ISPRS  Technical  Commission  I  Symposium,  17-20  November  2014,  Denver,  Colorado,  USA 


Figure  1:  (Top)  Overview  of  scan  area  and  target  placements  (Middle  and  bottom  rows)  Images  of  targets  [No  image  of  target  7 
available] 


2.  DATA  COLLECTION 

For  this  series  of  tests,  we  explored  a  variety  of  tactical  scenarios 
and  targets  of  interest  to  compare  and  contrast  the  technologies. 
A  data  acquisition  site  was  chosen  at  approximate  coordinates: 
45.661°N,  111.345°W.  The  site  is  approximately  14  miles  west 
of  Bozeman,  MT.  A  Riegl  VZ-400  terrestrial  laser  scanner  and 
a  Bridger  Photonics  HRS-3D-1W  terrestrial  laser  scanner  were 
mounted  on  tripods  and  placed  side-by-side  (Figure  1,  top).  Sev¬ 
eral  targets  were  placed  in  the  scene.  Targets  1  through  7  were 
placed  at  ranges  between  approximately  200  and  250  meters.  Tar¬ 
gets  8  and  9  were  placed  at  68  m  and  45  m,  respectively.  The  tar¬ 
gets  were  designed  to  simulate  sniper  blinds  and  to  test  the  range 
ambiguity  issues  that  typically  arise  in  direct  detect  systems.  The 
targets  also  were  designed  to  evaluate  the  obscurant  penetration 
ability  of  the  scanners.  Camouflage  netting  and  various  types  of 
fencing  were  used  to  obscure  targets.  A  list  of  the  targets,  their 
descriptions,  and  their  ranges  from  the  scanners  are  shown  in  Ta¬ 
ble  2. 

The  scans  from  both  instruments  were  performed  with  the  mea¬ 
surement  angular  resolution  set  at  0.001°  x  0.001°.  The  excep¬ 
tion  was  for  Target  9  which  was  scanned  at  a  0.005°  x  0.005° 
resolution.  The  target  scans  were  not  made  simultaneously  to 
avoid  interference  introduced  by  each  of  the  system’s  lasers. 

3.  COMPARISONS 

The  range  ambiguity  and  obscurant  penetration  performance  of 
the  two  scanner  technologies  was  explored.  A  test  of  the  maxi¬ 
mum  range  of  the  scanners  was  also  made.  The  comparisons  be¬ 
tween  the  scanners  on  these  topics  are  discussed  in  the  following 
sections. 


Target  # 

Range(m) 

Description 

1 

220 

Sniper  blind  with  camou¬ 
flage  netting 

2 

185 

Sniper  blind  with  calibra¬ 
tion  pattern 

3 

249 

Chicken  wire  near  vege¬ 
tation 

4 

232 

Mesh  screen 

5 

250 

Chicken  wire  concealing 
mortar  stand 

6 

253 

Sniper  blind  concealing 
dummy  with  rocket 
launcher 

7 

238 

Fencing  in  front  of  tree 

8 

68 

Dummy  in  trench  with 
covered  with  fencing 

9 

45 

Tent  concealing  rifle  on 
tripod 

Table  2:  Target  descriptions  and  their  ranges  from  the  scanners 

3.1  Range  ambiguity 

Targets  1  and  2  were  chosen  to  study  the  range  ambiguity  of  the 
scanners  more  in-depth.  Target  1  was  covered  by  a  camouflage 
netting  and  had  a  wooden  vertical  support  stand  in  front  of  a  white 
painted  background  board.  The  front,  central  portion  of  Target  1 
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Figure  2:  (Top)  Point  clouds  of  Target  1  from  (left)  VZ-400  and  (right)  HRS-3D.  Insets  show  chosen  point  cloud  subsets  for  further 
analysis. 


was  isolated  for  the  following  analysis  of  both  point  clouds  as 
seen  in  the  insets  of  Fig  2.  The  histograms  of  Target  l’s  iso¬ 
lated  range  returns  for  the  VZ-400  and  HRS-3D  is  seen  at  top 
and  bottom  of  Figure  3,  respectively.  There  is  a  rather  stark  con¬ 
trast  between  the  two  range  return  histograms.  As  can  be  seen  in 
the  top  plot  of  the  VZ-400  range  return  histogram,  there  are  two 
range  return  peaks  at  approximately  220.6  m  and  220.75  m,  and 
a  smaller,  less  distinct  peak  at  approximately  220.95  m.  The  first 
two  peaks  are  due  to  the  netting  on  the  left  and  right  side  of  the 
vertical  support.  However,  the  netting  directly  in  front  of  the  sup¬ 
port  does  not  appear  as  a  return  in  the  data.  Instead,  the  returns 
appear  ambiguous  as  many  are  registered  between  the  netting  and 
support  stand.  The  support  stand  itself  is  the  ill-defined  peak  at 
220.95  m. 

Meanwhile,  the  HRS-3D  returns  rendered  significantly,  better- 
defined  surfaces  for  the  netting  and  vertical  support.  The  netting 
surface  is  the  peak  at  221  m  in  the  bottom  histogram  of  Figure 
3.  The  vertical  support  is  the  sharp  peak  at  221.5  m.  There  are 
only  a  few  stray  points  which  appear  between  the  netting  and 
vertical  support  surfaces  in  the  HRS-3D  data.  These  stray  points 
are  mainly  located  between  ranges  of  221.15  m  and  221.35  m. 

Two  sections  of  Target  2  were  chosen  to  compare  the  range  am¬ 
biguities  of  the  two  scanners.  The  isolated  sections  are  outlined 
in  white  boxes  in  4  and  the  insets  show  the  point  cloud  subsets. 
The  first  section  included  the  double-layered  mesh,  vertical  sup¬ 
port,  and  backboard  of  Target  2.  The  range  return  histograms  for 
the  first  section  are  shown  in  Figure  5  for  the  VZ-400  (top)  and 
HRS-3D  (bottom).  Interestingly,  the  VZ-400  histogram  shows  4 
distinct  peaks.  The  first  peak  at  184.5  m  is  due  to  the  double¬ 
layered  mesh  surface.  However,  the  double-layered  mesh  is  not 
resolved  by  the  VZ-400  into  two  separate  surfaces.  The  second 
and  third  peaks  are  located  at  185  m  and  185.15  m,  respectively. 
The  double  peaks  here  are  due  to  the  range  ’’pulling”  effect  in¬ 
duced  by  the  mesh.  This  effect  is  due  (in  part)  to:  1)  the  inability 
of  the  linear  mode  system  to  resolve  objects  outside  of  its  pulse 
bandwidth  and  2)  movement  of  the  mesh  material  during  the  scan 
event.  The  upper  portion  of  the  vertical  support  has  an  unabated 
line-of- sight  to  the  scanner  while  the  lower  portion  has  the  double 
layer  mesh  intervening  the  scanner’s  line-of- sight  to  the  vertical 
support.  The  mesh  interference  shifted  the  range  return  of  the 
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220.4  220.6  220.8  221  221.2 
Range  (m) 


Range  (m) 

Figure  3:  Range  return  histograms  of  Target  1  area  for  (Top)  VZ- 
400  and  (Bottom)  HRS-3D 


lower  portion  of  the  vertical  support  approximately  15  cm  closer. 
The  backboard  return  is  located  at  approximately  185.7  m.  There 
is  also  a  considerable  ”filling-in”  of  returns  between  the  mesh  and 
vertical  support  in  the  VZ-400  data  as  can  be  seen  between  184.5 
and  185  m. 
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Figure  4:  (Top)  Point  clouds  of  Target  2  from  (left)  VZ-400  and  (right)  HRS-3D.  Insets  show  chosen  point  cloud  subsets  for  further 
analysis. 
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Figure  5:  Range  return  histograms  of  Target  2  mesh,  vertical  sup¬ 
port,  and  backboard  for  (Top)  VZ-400  and  (Bottom)  HRS-3D 


The  HRS-3D  data  shows  three  distinct  range  return  peaks  around 
185  m.  Two  of  these  are  due  to  the  double-layered  mesh  (a  sepa¬ 
ration  of  only  a  few  cm)  while  the  third  peak  is  due  to  the  horizon¬ 
tal  frame  at  the  bottom  of  Target  2.  The  central  vertical  support  at 
185.55  m  and  the  backboard  at  186.2  m  are  both  clearly  defined 


in  the  range  returns.  There  are  a  few  spurious  returns  in  the  HRS- 
3D  data  located  at  185.3  m  range.  The  fill-in  observed  between 
the  vertical  support  and  the  backboard  at  ranges  of  185.7  m  to 
186  m  is  due  to  low-lying  vegetation. 


5 


Offset  (m) 

Figure  6:  Log  histograms  of  offset  from  Target  2  backboard  for 
(Top)  VZ-400  and  (Bottom)  HRS-3D 
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Figure  7:  Point  clouds  of  Targets  3  (top)  and  4  (bottom),  VZ-400 
and  HRS-3D  point  clouds  are  on  the  left  and  right  sides,  respec¬ 
tively. 


The  second  section  isolated  from  Target  2  is  the  backboard  con¬ 
taining  the  vertical  black  resolution  test  lines  as  can  be  seen  in 
Figure  4  top-left  insets.  For  each  of  the  sub-setted  point  clouds, 
a  vertical  plane  was  calculated  to  act  as  the  geometrical  repre¬ 
sentation  of  the  backboard.  The  distance  between  the  backboard 
plane  and  each  point  was  then  calculated.  This  distance  is  the 
offset  between  the  3-D  data  point  and  the  vertical  plane.  Loga¬ 
rithm  histograms  of  the  distance  offsets  are  shown  in  Figure  6  for 
the  VZ-400  (top)  and  HRS-3D  (bottom).  Negative  and  positive 
distance  offset  values  represent  points  behind  and  in  front  of  the 
backboard  from  the  scanners’  viewpoints,  respectively.  The  VZ- 
400  histogram  shows  a  distinct  negative  offset  shoulder.  This  is 
due  to  range  ambiguities  between  the  top  of  the  backboard  and  a 
tree  limb  located  approximately  0.5  m  behind  Target  2.  The  pos¬ 
itive  distance  offset  shoulder  shows  that  the  VZ-400  registered 
returns  out  to  8  cm  in  front  of  the  backboard.  Meanwhile,  the 
HRS-3D  histogram  shows  that  there  is  no  such  range  ambiguity 
between  the  backboard  and  the  tree  behind  Target  2.  Additionally, 
the  distance  offsets  are  confined  to  d=  5  cm  from  the  backboard 
plane. 

3.2  Obscurant  penetration 

In  this  section,  the  obscurant  penetration  performance  of  the  two 
scanners  is  compared.  Some  of  the  differences  have  already  been 
demonstrated  from  Targets  1  and  2  in  the  previous  section.  Fig¬ 
ures  7  and  8  show  the  remainder  of  the  targets’  point  clouds. 
The  VZ-400  point  clouds  are  on  the  left  side  while  the  HRS-3D 
point  clouds  are  on  the  right  side.  In  Target  3  (Figure  7  top),  it 
can  be  seen  that  the  VZ-400  point  cloud  (left)  is  far  sparser  than 
the  HRS-3D  point  cloud  (right)  behind  the  chicken  wire  fencing. 
Low-lying  vegetation  can  be  seen  in  the  HRS-3D  data  but  is  un¬ 
recognizable  in  the  VZ-400  data.  In  Target  4  (Figure  7  bottom),  it 
can  be  seen  that  both  scanners  could  not  easily  achieve  consistent 
penetration  through  the  mesh  screen.  However,  the  HRS-3D  was 
able  to  register  approximately  10%  of  the  total  returns  had  the 
intervening  screen  not  been  present.  The  VZ-400  did  not  register 
any  returns  from  behind  the  mesh  screen.  In  Target  5  (Figure  8a), 
the  VZ-400  did  not  register  any  returns  from  the  mortar  stand  be¬ 
hind  the  chicken  wire  (see  inset  for  top-down  view.  The  HRS-3D 
was  able  to  clearly  resolve  the  mortar  stand’s  two  legs  and  barrel. 
In  Target  6  (Figure  8b),  the  VZ-400  was  unable  to  register  the 
dummy  with  the  rocket  launcher  behind  the  sniper  blind.  Inter¬ 


estingly,  however,  the  intensity  image  of  the  dummy’s  head  and 
shoulders  can  be  seen  overlaid  on  the  sniper  blind.  The  HRS-3D 
was  able  to  easily  resolve  the  dummy’s  head,  torso,  and  limbs  and 
was  able  to  resolve  the  barrel  of  the  rocket  launcher.  Both  scan¬ 
ners  were  able  to  easily  penetrate  the  fencing  of  Target  7  (Figure 
8c).  However,  the  HRS-3D  was  slightly  more  successful  than  the 
VZ-400  at  registering  points  on  the  actual  fenceline.  In  Target  8 
(Figure  8d),  the  VZ-400  was  unable  to  geometrically  resolve  the 
dummy  in  the  ditch  under  the  fencing.  However,  similar  to  Target 
6,  the  dummy’s  head  appears  as  higher  intensity  points  mapped 
onto  the  surface  of  the  obscuring  fenceline.  The  HRS-3D  is  able 
to  geometrically  resolve  the  dummy’s  head,  torso,  and  arms.  Fi¬ 
nally,  in  Target  9  (Figure  8e),  both  scanners  are  able  to  penetrate 
the  tent  lining  to  resolve  the  propped-up  rifle  inside.  The  HRS- 
3D  resolves  features  such  as  the  scope  and  the  shoulder  strap. 
The  VZ-400  does  a  better  job  at  registering  the  intensity  changes 
of  the  tent  door  flap. 

3.3  Maximum  range 

Several  measurements  were  made  in  this  study  to  assess  the  max¬ 
imum  range  capabilities  of  the  two  scanners.  The  VZ-400  and 
HRS-3D  were  both  capable  of  registering  returns  from  highly  re¬ 
flective  surfaces  at  a  range  of  900  m.  The  surfaces  were  primar¬ 
ily  vegetation  and  thus  possessed  exceptional  reflectance  (0.7)  at 
1.55  ii m.  The  HRS-3D  registered  approximately  twice  as  many 
returns  than  the  VZ-400  at  these  long  ranges. 

4.  ANALYSIS 

From  the  standpoint  of  tactical,  military  targeting  and  topographic 
rendition,  it  is  clear  from  this  study  that  the  FMCW  technology 
vastly  improves  range  ambiguity  and  obscurant  penetration  of  ter¬ 
restrial  laser  scanners.  The  returns  from  closely  spaced  surfaces 
(down  to  approximately  10  cm  in  separation)  are  capable  of  being 
range-resolved  by  the  HRS-3D.  In  contrast,  the  VZ-400  begins  to 
experience  range  ambiguity  problems  when  surface  separations 
are  at  approximately  0.5  m  or  less.  This  is  especially  true  for 
closely  spaced  surfaces  at  larger  ranges,  as  can  be  seen  in  Targets 
1,  2,  5,  and  6.  The  VZ-400  tends  to  do  a  better  job  at  range  res¬ 
olution  and  obscurant  penetration  when  the  targets  are  closer  in 
range,  as  in  Target  9.  Interestingly,  the  VZ-400  scanner  recorded 
higher  intensity  values  from  the  concealed  objects  mapped  onto 
the  surfaces  of  the  obscurants  as  in  Targets  6  and  8.  It  may  be  the 
case  that  the  concealed  object’s  Gaussian  return  shoulder  is  hav¬ 
ing  an  additive  effect  to  the  obscurant’s  surface  return.  Yet,  the 
concealed  object’s  Gaussian  peak  is  not  strong  enough  to  be  reg¬ 
istered  as  its  own  3-D  return  by  the  sensor.  It  was  also  observed 
that  the  HRS-3D  displayed  remarkable  vegetation  feature  reso¬ 
lution  at  larger  ranges.  Tactically,  this  is  particularly  important 
in  foliage  penetration(Massaro  et  al.,  2012)  and  the  rendition  of 
camouflage.  It  is  believed  that  FMCW-LADAR  systems  will  be 
more  adept  at  highly  accurate  vegetation  mapping(Pirotti  et  al., 
2013). 

5.  CONCLUSION 

While  the  FMCW  system  certainly  outperformed  the  conventional, 
pulsed  system  in  terms  of  range  resolution,  obscurant  penetration, 
and  maximum  range,  there  are  some  drawbacks  to  the  FMCW 
system.  It  currently  takes  about  six  times  as  long  for  a  3-D  scene 
to  be  acquired  with  the  FMCW  system.  The  FMCW  scene  acqui¬ 
sition  time  can  likely  be  decreased  however  with  improved  FPGA 
boards  and  optimized  scanning  techniques.  The  size,  weight,  and 
power  requirements  of  the  FMCW  system  are  also  significantly 
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(a)  Target  5  (insets  are  top-down  views) 


(b)  Target  6  (insets  are  top-down  views) 


(c)  Target  7 


(d)  Target  8  (insets  show  close-up  of  dummy  location) 


(e)  Target  9  (insets  show  points  from  tent  interior) 


Figure  8:  Point  clouds  of  Targets  5  through  9  from  top  to  bottom.  VZ-400  and  HRS-3D  point  clouds  are  on  the  left  and  right  sides, 
respectively. 
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greater  than  the  slimmer  conventional  system.  However,  this  can 
be  remedied  by  eventual  inclusion  of  the  processing  unit  into  the 
scanner  head. 

As  compared  to  heterodyne  laser  radar  systems  nearly  30  years 
ago(Keyes,  1986),  FMCW-LADAR  seems  to  have  overcome  many 
technological  hurdles.  In  all,  it  seems  clear  that  the  future  of  ter¬ 
restrial  laser  scanning,  and  possibly  air-  and  space-borne  LADAR, 
lies  in  FMCW  systems.  In  addition  to  the  superiority  of  their 
range  resolution  and  obscurant  penetration,  FMCW  systems  can 
provide  Doppler  velocities  and  theoretically  achieve  greater  max¬ 
imum  range  given  the  same  power  input. 
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Feasibility  study  to  detect  moving  vehicles: 

❖  Prototype/test  a  mobile,  easily  deployable  sensor  system 

■  Main  component:  laser  sensing  technology 

■  GPS-based  georeferencing  and  time  base 

■  Optional  optical  sensing 

❖  Acquire  sample  data 

❖  Processing 

■  Develop  vehicle  specific  feature  extraction  algorithm 
to  extract  shape  form  laser  point  cloud 

■  Parameterize  extracted  geometric  parameters  with 
accuracy  term  estimates 

■  Estimate  velocity  and  attitude  of  vehicle 

■  Model  vehicle  trajectory 

❖  Report  on  results  and  recommendations 
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Initial  Sensor  Platform  and  Configuration 


❖  Install  sensors  in  the  OSU  GPSVan 

❖  High-performance  georeferencing  with  data  acquisition 

❖  Suitable  platform  to  install  imaging  sensors 


Multiple  GPS 
antennae 
Roof  platform  for 
imaging  sensors 
High-performance 
IMU  sensors 
Data  acquisition 
system,  power 
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Laser  Sensor  Comparative  Tests 


❖  Laser  technology  is  rapidly  advancing 

❖  Application  specific  sensors  are  introduced  (UAS) 

❖  Performance  evaluation  needed 

■  to  optimize  sensor  selection 

■  to  provide  reference  for  low-end  sensors 

❖  Profilers/scanners  considered: 


SICK  LMS30206 

UTM-30LX-EW 

Velodyne  HDL-64E 

SiCK 


IbeoAlasca  XT 


Velodyne  HDL-32E  Bridger’s  HRS-3D-1W 
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Test  Site:  OSU  Airport 


Site  Location:  OSU  Airport 
❖  RWY  9R-27L 

■  5002’ X  150’ 

■  Precision  Instrument  Marking 

■  Utilized  by  aircraft  up  to  RDC  C-l  1 1 
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Some  Characteristics  of  Laser  Scanning 


❖  Side  effect:  motion  artifact;  objects  change  shape  based 
on  relative  motion  between  sensor  and  object 


❖  Benefit:  from  motion  artifact,  shape  deformation,  relative 
velocity  can  be  estimated  (object  shape  must  be  known) 
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Some  Aspects  of  Processing 


❖  Investigation  of  data  acquisition  and  sensing  properties 

■  Point  cloud  generation 

■  Range  and  surface  dependency 

■  Sample  time  dependency 

■  Observability  analysis 

❖  Shape  parameterization  based  on  point  cloud 

■  Estimation  accuracy 

■  Shape  matching 

❖  Sensor  network  formation 

■  Sensor  synchronization 

■  GPS-based  georeferencing,  calibration 

❖  High-level  processing 

■  Combining  multiple  sensor  data 

■  Vehicle  trajectory  and  velocity  estimation 
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❖  Field  tests  at  Don  Scott  airport  (OSU),  August  6  and  8,  2014 

❖  Crew:  OSU  and  Asymmetric  Inc.,  first  user  of  Bridgets  HRS-3D-1 W 


8 


OHIO 

SPtfE 

UNIVERSITY 


Sensor  Configuration 


9 


T  •  H  •  E 

OHIO 

SIATE 

UNIVERSITY 


_ Name _ 

Manufacture 

Data  interface 

Measuring 
Angular  range 

Multi-layer 

Multi-target 

Rotation  frequency 

Angular  resolution 

Max.  range 

Accuracy 

Hardware  elements 

Power  supply 
Weight 


Sensors’  Specification 


LMS-221  -30206 

HDL-32E 

HRS-3D-1W 

SICK  AG 

Velodyne 

Bridger  Photonics  Inc. 

RS232  or  RS422 

Ethernet  cable  (UDP/IP) 

TOF 

TOF 

FM  CW 

100°/ 180° 

360° 

No 

Yes,  32  scan  planes  (+10°  - 
+30°) 

No 

No 

N/A 

10  Hz 

>0.25-1° 

1 .33°  vertical,  0.6  0 
horizontal 

30  m  with  1 0%  reflectivity 

70  m 

>2  km  (10% 
reflectivity) 

10  mm  ±  35  mm 

+-2  cm  at  10Hz  (one  sigma 
at  25  m) 

sensor(s)  +  LMI  (optional) 

sensor  +  interface  box  + 

GPS  +  IMU 

24  V  DC  20W  1 .8A+ 
(heating:  24  V  DC  MOW  6A) 

9-32  V  DC  (1A,  12  W) 

9  kg 

<2  kg 

1 
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Scanning  static  aircraft  (August  6,  2014) 
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Data  Acquisition  Trajectory 
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GPSVan  trajectory;  surveyed  Cessna  aircraft  marked 
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Scanning  Aircraft  on  the  Tarmac 


Scanning  static  aircraft  (August  8,  2014) 
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Scanning  Aircraft  on  the  Runway 


❖  Scanning  moving  aircraft 
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Profiles  in  a  Global  Frame 
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Accuracy  Test 
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[m] 
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Manufacturer’s  specification:  <  1  cm  +  3.5  cm 
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Scanner  vs  Profiler  Data 


Profiles  at  Different  Distances 
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Accuracy  Analysis:  Velodyne  HDL-32E 


-  —  STD 

-  —  MAD 


30  40  50 


/ 

/ 

/ 


/ 

/ 


/  r 


/ 


/ 

*  / 
/ 

/ 

/ 


/ 


/ 


60  70 


Distance  from  the  wall  [m] 


Manufacturer’s  specification: 

<  2  cm  (<25  m) 


80 


21 


Exploitation  of  Full- waveform  LiDAR  to  Characterize/Exploit  under  Canopy  Targets  -  Foliage  Penetration  (FOPEN) 
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Summary:  In  contrast  to  LiDAR  data  provided  by 
discrete  return  systems,  full  waveform  LiDAR  data 
(FWD)  improve  the  quality  of  products  and  extend 
the  possibilities  of  their  application.  Beside  evident 
benefits,  FWD  imposes  strong  requirements  on  the 
sensor  acquisition  and  storage  hardware.  At  the 
moment,  there  is  little  effort  reported  on  sensor 
level  waveform  data  compression.  Vendor  specified 
waveform  data  formats  are  generally  not  published 
for  the  users  and  do  not  mention  compression  op¬ 
tions.  Since  the  recorded  waveform  is  intrinsically 
noisy,  there  is  less  practical  need  to  use  lossless 
compression  methods.  As  long  as  the  properties  of 
FWD  are  preserved,  in  other  words,  as  long  as  it  is 
possible  to  extract  the  same  FWD  features,  and  the 
compression  noise  is  below  or  comparable  to  the 
noise  of  the  signal,  lossy  compression  methods  are 
suitable.  Such  compression  of  FWD  was  studied  in 
previous  work  where  waveforms  were  compressed 
individually  or  in  groups  forming  images,  which  is 
considered  as  ID  and  2D  compression,  respective¬ 
ly.  This  work  presents  a  strategy  for  FWD  compres¬ 
sion  that  is  based  on  multi-component  transforms, 
which  is  included  in  JPEG-2000  Standard  Part  2. 
This  extension  to  JPEG-2000  Standard  exploits  the 
3D  correlation  between  waveform  samples  and  al¬ 
lows  compressing  waveform  cubes  without  organ¬ 
izing  samples.  The  results  of  this  study  indicate 
that  the  removal  of  data  redundancies  in  all  three 
dimensions  results  in  slightly  better  compression 
performance  than  using  ID  or  2D  approaches. 
More  importantly,  the  user  has  the  flexibility  to  de¬ 
cide  on  how  much  the  data  should  be  compressed  or 
what  level  of  the  reconstruction  error  is  allowed. 
Besides  JPEG-2000  compression,  this  investigation 
includes  experiments  with  additional  data  decorre¬ 
lators,  such  as  Karhunen-Loeve  transform  and 
wavelet  transform.  The  conclusion  of  this  study  is 
that  the  JPEG-2000  Standard  is  an  effective  method 
for  FWD  compression  of  waveform  cubes,  result¬ 
ing  in  high  compression  ratios  and  low  data  degra¬ 
dation. 


Zusammenfassimg:  Untersuchung  zur  Kompres¬ 
sion  von  Full  Waveform  LiDAR-Daten  auf  Sensor- 
ebene  unter  Verwendung  der  JPEG-2000  Multi- 
komponenten-Transformation.  Im  Gegensatz  zu 
den  ublichen  Discrete  Return  Systemen  konnen 
Full  Waveform  LiDAR-Daten  (FWD)  die  Qualitat 
der  Produkte  verbessern  und  erweitern  somit  ihre 
Anwendungsmoglichkeiten.  Neben  diesen  offen- 
sichtlichen  Vorteilen  stellen  FWD  sehr  hohe  Anfor- 
derungen  an  den  Sensoraufbau  und  die  verfugbare 
Speicherkapazitat.  Bisher  gibt  es  noch  wenige  Ar- 
beiten  zur  Datenkompression  der  FWD-Daten  auf 
Sensorebene.  Herstellerspezifische  Full  Waveform 
Datenformate  werden  in  aller  Regel  nicht  dem  An- 
wender  zur  Verfugung  gestellt  und  erwahnen  keine 
Moglichkeit  der  Datenkompression.  Da  die  aufge- 
zeichneten  FWD  ohnehin  verrauscht  sind,  ist  es 
nicht  notig,  eine  verlustfreie  Kompression  zu  ver- 
wenden.  Solange  die  Eigenschaften  der  FWD  er- 
halten  bleiben,  das  heiBt,  dieselben  FWD-Merkma- 
le  extrahiert  werden  konnen  und  das  Kompres- 
sionsrauschen  unter  oder  vergleichbar  dem  Signal- 
rauschen  ist,  konnen  auch  verlustbehaftete  Kom- 
pressionsmethoden  genutzt  werden.  Diese  Art  der 
FWD -Kompression  ist  aus  vorherigen  Studien  als 
ID-  oder  2D -Kompression  bekannt,  bei  denen  ein- 
zelne  oder  Gruppen  von  Wellenformen  als  Bilder 
interpretiert  und  komprimiert  werden.  In  dieser 
Arbeit  wird  eine  Strategic  zur  FWD -Kompression 
prasentiert,  welche  auf  einer  Multikomponenten- 
Transformation  basiert  und  im  JPEG-2000  Stan¬ 
dard  Teil  2  beschrieben  ist.  Diese  ist  eine  Erweite- 
rung  zum  JPEG-2000-Standard,  die  die  3D-Korre- 
lation  zwischen  Waveform-Proben  auswertet  und 
damit  die  Kompression  eines  kompletten  Wave- 
form-Kubus  ohne  Proben  als  Startwerte  erlaubt. 
Die  Ergebnisse  dieser  Studie  zeigen,  dass  das  Ent- 
fernen  von  Redundanzen  in  alien  drei  Dimensionen 
zu  einer  geringfugig  besseren  Kompression  als  die 
ausschliefiliche  Nutzung  von  ID-  oder  2D-Infor- 
mationen  fuhrt.  Zusatzlich  eroffnet  sich  dem  Nut- 
zer  jedoch  die  Moglichkeit  zu  entscheiden,  wie 
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stark  die  Daten  komprimiert  werden  sollen  oder 
welcher  maximale  Fehler  bei  der  Rekonstruktion 
zugelassen  wird.  Neben  der  JPEG-2000-Kompres- 
sion  beinhaltet  unsere  Untersuchung  Experimente 
mit  zusatzlicher  Datendekorrelation  wie  der  Kar- 
hunen-Loeve -Transformation  und  der  Wavelet- 


Transformation.  Das  Ergebnis  dieser  Studie  zeigt, 
dass  der  JPEG-2000 -Standard  eine  effektive  Me- 
thode  fur  FWD-Kompression  in  Form  eines  Wave- 
form-Kubus  bietet.  Daraus  resultiert  eine  hohe 
Kompressionsrate  bei  nur  geringem  Qualitatsver- 
lust  der  Daten. 


1  Introduction 

The  hardware  developments  of  laser  scanning 
technology  continuously  provide  new  applica¬ 
tion  possibilities,  though,  limitations  and  diffi¬ 
culties  are  frequently  encountered  in  the  intro¬ 
duction  phase.  Starting  from  2004,  when  the 
full-waveform  digitization  became  available 
for  the  commercial  airborne  scanning  systems 
(Hug  et  al.  2004,  Mallet  &  Bretar  2009), 
improvements  of  quality  of  LiDAR  data  and 
products  have  been  observed.  The  main  ad¬ 
vantages  of  the  full  waveform  data  (FWD) 
are:  (1)  denser  and  more  accurate  point  cloud 
generation  (Mallet  &  Bretar  2009,  Par¬ 
rish  &  Nowak  2009),  (2)  improved  results  in 
vegetation  mapping,  e.g.  for  forestry  applica¬ 
tions  (Pirotti  2011),  and  (3)  better  point  cloud 
classification  (Reitberger  et  al.  2008,  Toth 
et  al.  2010,  Heinzel  &  Koch  2011,  Mallet  et 
al.  2011).  Despite  of  the  advantages  of  FWD, 
technology  limitations  are  still  present.  For 
example,  waveform  data  may  not  be  record¬ 
ed  at  maximum  pulse  rate  designated  for  the 
discrete  return  systems;  there  is  also  a  lack  of 
vendor  independent  tools  for  waveform  data 
processing;  finally,  the  most  common  problem 
is  the  amount  of  FWD.  Typically,  FWD  binary 
files,  e.g.  Riegl  SDF,  are  3-4  times  larger  than 
the  binary  files  (LAS)  containing  correspond¬ 
ing  point  clouds.  Although  storage  technolo¬ 
gies  continue  to  develop,  allowing  for  faster 
read/write  operation  and  accommodation  of 
larger  data  volumes,  the  drawback  of  the  in¬ 
creasing  size  of  FWD  continues  to  be  an  issue. 

Most  of  the  activities  of  LiDAR  data  com¬ 
pression  are  concerned  with  the  reduction  of 
the  point  cloud  size  to  better  support  the  dis¬ 
semination  of  primary  LiDAR  products.  The 
most  widely  used  solutions  in  practice  are  the 
lossless  LASzip  (Isenburg  2011),  LASCom- 
pression  (GeMMA  Lab  2009,  Mongus  &  Zalik 
2011)  and  the  lossy/lossless  LiDAR  Compres¬ 


sor  (Lizardtech  2014).  Other  approaches  are 
hardware  accelerated  compression  (Biasizzo 
&  Novak  2013)  or  considering  the  point  cloud 
thinning  as  lossy  compression  in  topographic 
applications  (Pradhan  et  al.  2005).  Although 
the  need  for  FWD  compression  is  indisput¬ 
able,  there  is  rather  limited  research  related 
to  sensor  level  waveform  compression.  Work 
on  compressing  each  waveform  separately 
(Toth  et  al.  2010)  and  exploiting  2D  correla¬ 
tion  between  waveform  samples  for  more  ef¬ 
ficient  compression  is  reported  in  Jozkow  et 
al.  (2015).  Bunting  et  al.  (2013)  proposed  the 
Sorted  Pulse  Data  (SPD)  format  for  storing 
LiDAR  data,  implemented  in  the  open  source 
software  library  SPDLib  (SPDlib  2013).  SPD 
follows  the  Hierarchical  Data  Format  version 
5.0  (HDF5)  (Koranne  2011,  The  HDF  Group 
2014)  which  supports  lossless  compression 
using  the  zlib  deflate  algorithm  (The  Internet 
Engineering  Task  Force  (IETF)  1996).  Since 
this  compression  does  not  exploit  LiDAR  data 
properties,  e.g.  spatial  or  temporal  correlation, 
the  compression  ratio  is  likely  to  be  limited. 
Due  to  the  complexity  of  data  stored  in  SPD, 
it  is  difficult  to  compare  the  compression  ra¬ 
tios  with  those  obtained  using  other  methods. 
Though  there  are  results  reported  on  SPD  file 
compression,  they  are  based  on  limited  ex¬ 
periments  (Bunting  et  al.  2013).  Another  pro¬ 
posed  waveform  exchange  standard,  Pulse- 
Waves  (Isenburg  2014)  provides  an  option  for 
file  compression,  but  it  is  still  in  the  develop¬ 
ment  phase,  so  details  are  unknown  at  the  mo¬ 
ment.  Additionally,  waveform  decomposition 
for  a  sum  of  components  or  echoes  (Chauve 
et  al.  2007,  Mallet  &  Bretar  2009)  also  re¬ 
presents  lossy  FWD  compression  allowing  for 
reconstruction/decompression.  Obviously,  the 
compression  rate  and  data  distortion  strongly 
depend  on  the  number  of  detected  echoes.  Ac¬ 
cording  to  the  authors’  knowledge,  there  is  no 
work  published  assessing  the  performance  in 
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terms  of  ratio  and  reconstruction  noise  of  the 
lossy  compression  represented  by  waveform 
decomposition  parameters. 

Since  there  is  little  need  for  lossless  com¬ 
pression  of  FWD  (Jozkow  et  al.  2015),  this  pa¬ 
per  investigates  a  lossy  compression  approach 
which  employs  the  extensions  of  JPEG-2000 
Standard  Part  2  of  multi- component  trans¬ 
form.  This  transform  allows  exploiting  the 
correlation  of  waveform  samples  along  three 
directions:  waveform,  scan  line,  flight  line, 
and  then  compresses  the  entire  waveform 
cube  without  the  need  of  separating  and  com¬ 
pressing  single  waveforms  or  arranging  them 
into  groups,  etc.  In  relation  to  ID  (Toth  et  al. 
2010),  and  2D  (Jozkow  et  al.  2015)  compres¬ 
sion  of  FWD,  the  expected  gain  of  the  ap¬ 
proach  proposed  here  is  a  higher  compression 
ratio,  as  the  data  redundancy  may  be  better 
removed  by  considering  full  spatial  (3D)  cor¬ 
relation  of  the  waveforms.  Additionally,  two 
other  transforms  for  FWD  decorrelation  were 
tested  to  investigate  whether  the  reported  high 
performance  of  compression  based  on  JPEG- 
2000  Standard  could  be  further  improved. 

2  Waveform  Data  Arrangement 

2.1  Waveform  Cube 

Full  waveform  data  is  not  necessarily  restrict¬ 
ed  to  the  digitized  waveform,  but  is  usually 
identified  with  waveform  signal  samples,  the 
essential  data  for  further  processing.  Addi¬ 
tional  data,  such  as  time  and  pointers  to  flight 
navigation  parameters  are  necessary  to  com¬ 
pute  geolocation  and,  finally,  to  create  the 
point  cloud.  The  size  of  the  mandatory  meta¬ 
data,  however,  is  much  smaller  than  the  size 
of  waveform  samples,  which  implies  that  the 
compression  is  mainly  considered  for  wave¬ 
form  samples.  Besides  the  vendor  specified 
formats  for  waveform  data  storage,  which  are 
usually  unknown,  there  are  independent  stan¬ 
dards  allowing  the  waveform  data  exchange, 
such  as  the  LAS  format  (ASPRS  2013);  note 
that  only  point  data  record  formats  starting 
from  LAS  vl.3  are  able  to  store  FWD,  and  the 
PulseWaves  tool  (Isenburg  2014)  or  SPDLib 
(SPDlib  2013)  have  this  capacity,  too.  The 
strategy  used  in  the  approach  proposed  here 


for  waveforms  compression  is  based  on  a  3D 
waveform  data  structure,  called  a  waveform 
cube  (Jozkow  et  al.  2015),  due  to  its  similarity 
to  the  image  cube  of  hyper- spectral  images.  A 
very  similar  idea  of  representing  waveforms 
in  a  volumetric  data  structure,  but  for  static 
terrestrial  scanning,  was  presented  earlier  by 
Stilla  &  Jutzi  (2008).  The  waveform  cube  is 
not  a  standard  or  a  file  format,  but  the  struc¬ 
ture  of  FWD  arrangement  that  maintains  the 
topology  of  waveform  samples  according  to 
the  data  acquisition  process  (Fig.  1).  The  three 
dimensions  of  the  waveform  cube  are:  flight 
direction  of  the  aircraft,  scan  line  (cross  flight 
direction),  and  the  direction  of  the  laser  pulse 
propagation  (waveform).  It  must  be  empha¬ 
sized  that  the  waveform  cube  is  not  a  geore- 
ferenced  structure,  such  as  an  orthophoto  or  a 
digital  terrain  model  (DTM)  grid,  i.e.  the  dis¬ 
tance  in  the  3D  space  between  any  two  ele¬ 
ments  of  the  cube  cannot  be  calculated  based 
in  the  cube  indices;  however,  the  topology  of 
waveform  samples  is  always  related  to  the  spa¬ 
tial  order  of  the  laser  pulses,  i.e.  the  sequence, 
as  they  are  acquired  in  time. 

While  the  formation  of  the  waveform  cube 
is  simple,  there  are  a  few  additional  aspects 
that  should  be  mentioned.  First,  the  outgoing 
pulse  is  also  digitized  as  part  of  the  wave¬ 
form  record,  since  this  information  is  essen¬ 
tial  for  waveform  decomposition.  There  are 
two  ways  to  compress  the  outgoing  waveform: 
either  jointly  with  the  return  waveform  or  in¬ 
dependently,  by  forming  another  waveform 


Fig- 1  :  Waveform  cube  structure. 
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cube.  Since  the  size  of  the  outgoing  waveform 
is  fixed,  the  second  option  is  preferred.  Sim¬ 
ilarly,  for  multi-wavelength  LiDAR  systems, 
waveform  cubes  can  be  formed  for  each  sen¬ 
sor;  note  the  correlation  among  the  different 
wavelength  waveform  could  be  potentially 
exploited  for  compression  in  the  future.  The 
second  aspect  is  the  object  space  complexity, 
which  has  a  paramount  effect  on  the  spatial 
correlation  of  the  waveforms.  Neighbouring 
waveforms  are  generally  similar  to  each  other 
over  open  and  slowly  changing  areas,  whereas 
in  built-up  areas,  such  as  urban  canyons,  their 


shape  can  change  abruptly,  resulting  in  a  less 
efficient  compression. 

The  third  aspect  is  the  waveform  cube  size 
which  is  defined  in  two  directions  by  the  wave¬ 
form  record  length  and  the  number  of  pulses 
per  scan  line.  Both  may  fluctuate  on  some  sys¬ 
tems,  in  which  case,  empty  waveform  samples 
or  records  can  be  inserted,  respectively.  The 
third  dimension,  however,  can  be  set  arbi¬ 
trarily,  ranging  from  a  few  scan  lines  to  all  the 
scan  lines  in  a  strip.  Given  the  spatial  extent 
and  the  computational  aspects,  a  nearly  equal 
size  in  the  horizontal  dimensions  is  preferred, 


(b) 


Fig.  2:  Test  waveform  cube  location:  (a)  Area  covered  by  both  the  Cl  and  C2  cubes,  Corbin,  Vir¬ 
ginia,  USA,  (b)  C3  cube,  Duck,  North  Carolina,  USA. 
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which  is  similar  to  the  standard  practice  of  til¬ 
ing  large  geospatial  data.  Note  that  some  sen¬ 
sors  use  multiple  waveform  digitizers,  requir¬ 
ing  the  use  of  multiple  cubes. 

In  summary,  the  idea  of  the  waveform  cube 
is  not  ideal  in  terms  of  implementation,  as  it 
may  not  be  directly  applicable  to  all  scanners; 
for  example,  waveform  sample  rearrangement 
may  be  needed.  Yet  in  a  statistical  sense,  the 
waveform  cube  provides  an  effective  way  to 
achieve  high  compression  performance  due  to 
its  ability  to  exploit  3D  correlation. 

2.2  Test  Data 

The  data  variability  greatly  affects  compres¬ 
sion  performance  manifested  by  the  value  of 
the  compression  ratio;  usually  data  with  a  low 
variation  can  be  compressed  with  higher  ra¬ 
tio  than  more  varying  data.  Consequently,  test 
data  for  the  assessment  of  compression  perfor¬ 
mance  should  be  chosen  carefully,  avoiding 
the  extreme  conditions  of  high  and  low  data 
complexity.  The  test  site,  shown  in  Fig.  2a, 
contains  a  mixture  of  topographic  elements, 
such  as  buildings,  road  infrastructure,  dense 
forest,  single  trees,  and  open  terrain.  To  sup¬ 
port  this  study,  two  waveform  cubes  were  ex¬ 
tracted  and  used  in  extensive  tests,  covering 
nearly  the  same  area  and  acquired  by  using 
two  different  LiDAR  systems  (Riegl  Q680i 
and  Riegl  Q780)  on  the  same  flight.  This  ex¬ 
plains  a  slightly  different  size  of  both  test 
waveform  cubes  of  504  x  1200  x  120  and  488 
x  1170  x  120  scan  lines,  waveforms  per  scan 
line,  and  samples  in  the  waveform  (/,  s,  w  di¬ 
mensions  in  Fig.  1),  for  the  first  (Cl)  and  the 
second  (C2)  cube,  respectively.  Additionally,  a 
3  km  single  strip,  C3  (Fig.  2b),  acquired  using 
a  Riegl  Q780  scanner,  was  processed,  allow¬ 
ing  to  test  the  algorithm  in  diverse  conditions, 
in  terms  of  topographical  objects.  Tests  for  C3 
data  included  both  emitted  and  returned  wave¬ 
form  cubes;  the  numbers  of  waveform  samples 
were  28  and  120,  respectively.  The  strip  was 
divided  into  12  cubes,  each  containing  496 
scan  lines,  and  the  number  of  pulses  was  1170 
per  scan  line. 

For  all  datasets,  the  signal  intensity  was 
sampled  at  1  ns  with  8  bit  resolution.  Thus,  the 
waveform  consisting  of  120  samples  represent 


a  vertical  range  of  36  m,  clearly  sufficient  to 
include  all  the  natural  and  man-made  objects 
in  these  areas. 


3  Compression  Strategy 

Similarly  to  our  previous  work,  only  lossy 
compression  methods  were  considered  be¬ 
cause  a  perfect  reconstruction  of  recorded 
FWD  is  practically  not  necessary,  as  dis¬ 
cussed  in  details  by  Jozkow  et  al.  (2015). 
Based  on  those  results,  it  was  concluded  that 
JPEG-2000  Standard  was  the  most  efficient 
among  the  tested  2D  strategies  of  waveform 
data  compression.  Here  the  objective  is  to  ex¬ 
tend  the  compression  from  2D  to  3D,  so  the 
goal  of  this  study  is  to  benefit  waveform  data 
compression  by  exploiting  the  correlation  of 
waveform  samples  in  each  of  three  dimen¬ 
sions:  along  flight  line,  scan  line,  and  wave¬ 
form  direction.  Extensions  of  the  JPEG-2000 
Part  2  (Taubman  &  Marcellin  2002)  intro¬ 
duce  a  multi-component  transform  result¬ 
ing  in  the  ability  to  compress  multi-band  im¬ 
ages.  In  short,  the  simplest  multi-component 
transform  first  applies  a  decorrelating  trans¬ 
form,  e.g.  ID  wavelet  transform,  to  each  pixel 
of  the  image  in  the  third  dimension,  and  then 
each  image  component,  e.g.  band,  follows 
the  JPEG-2000  Part  1  compression  schema. 
Due  to  the  Part  2  extension,  previous  JPEG- 
2000  restrictions  of  compression  only  single 
band  or  three  band  images,  as  RGB,  were  re¬ 
moved  and  the  possibility  of  applying  JPEG- 
2000  compression  to  hyper-spectral  images 
consisting  of  multiple  bands  became  avail¬ 
able  (Kulkarni  et  al.  2006).  Since  the  struc¬ 
ture  of  hyper-spectral  images  and  waveform 
cubes  are  identical,  both  data  types  can  be 
compressed  using  the  same  strategies.  The  ex¬ 
tension  of  the  JPEG-2000  Part  2  containing 
multi- component  transform  is  implemented 
only  in  a  few  specialized  software  packages, 
such  as  the  PICTools  Medical  SDK  for  com¬ 
pressing  volumetric  medical  scans  (AccuSoft 
2014),  LEADTOOLS  JPEG  2000  Image  Com¬ 
pression  SDK  (LEADTOOLS  2014),  Open- 
JPEG  library  (OpenJPEG  2014),  and  Kakadu 
Software  (Kakadu  Software  2013)  which,  in 
version  7.2,  was  used  in  this  study.  In  order 
to  investigate  the  variability  of  the  waveform 
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cube  affecting  compression,  additional  oper¬ 
ations  were  also  performed  prior  to  Kakadu 
compression.  The  flowchart  of  all  performed 
operations  is  presented  in  Fig.  3  and  discussed 
below  in  details. 

Since  the  actual  range  of  waveform  sample 
intensities  (values)  varies  for  each  band  of  the 
waveform  cube  and,  thus,  may  adversely  af¬ 
fect  the  decorrelating  transforms,  waveform 
cube  bands  were  normalized  before  applying 
these  transforms.  In  this  study,  three  variants 
of  the  band-wise  normalization  were  tested: 

•  zero -mean  (ZM),  where  the  mean  value  of 
the  band  was  subtracted  from  the  waveform 
samples  for  each  band, 

•  unit-variance  (UV),  where  beside  the  mean 
removal,  sample  normalization  was  applied 
so  that  each  band  had  unit  variance, 

•  no  normalization,  i.e.  using  the  original 
cube  data  (OC)  for  comparison  purposes. 

•  JPEG-2000  contains  the  full  lossy  com¬ 
pression  scheme,  including  (1)  the  trans¬ 
form  engine  for  data  decorrelation,  (2)  the 
quantization  engine  for  data  loss/reduction, 
and  (3)  the  bit  encoding  engine  for  lossless 
compression.  Nevertheless,  the  influence 
of  using  data  decorrelations  prior  to  JPEG- 
2000  is  of  interest  since  better  decorrela¬ 
tion  usually  results  in  a  better  compression 
rate.  Therefore,  as  a  preprocessing  step, 
three  different  decorrelation  methods  were 
tested: 


Fig.  3:  Flowchart  of  operations  executed  dur¬ 
ing  experiments. 


•  Karhunen-Loeve  transform  (KLT)  (Kar- 
hunen  1947,  Quirk  2003), 

•  wavelet  transform  (WLT)  using  Cohen- 
Daubechies-Feauveau  5/3  wavelet  (CDF 
5/3)  (Cohen  et  al.  1992), 

•  no  transform  (NOT)  for  comparison  pur¬ 
poses. 

KLT  was  applied  to  the  cube  regarded  as 
a  discrete  vector  stochastic  process  (Quirk 
2003).  This  means  that  the  whole  waveform 
cube  should  be  considered  as  a  single  large 
image  which  is  the  result  of  ‘gluing’  together 
slices  of  the  cube  along  the  waveform-scan 
line  plane.  Considering  dimensions  of  the 
cube  as  /,  5,  and  w  according  to  Fig.  1,  the  sin¬ 
gle  large  image  B  will  have  the  size  of  w,  and 
Is  (shown  on  the  right  side  in  Fig.  4)  and  then 
KL-transform  is  calculated  as: 

C  =  Kt  •  B  (1) 

where: 

™\  \  ^1,2  '  ’ '  Kw 


kw  i  kw2  •  •  •  kw,w 

is  the  KLT  matrix  whose  columns  are  the  ei¬ 
genvectors  of  the  covariance  matrix  of  image 
B.  Note  that  covariance  of  the  image  can  be 
calculated  two  ways,  depending  on  whether 
columns  or  rows  of  the  image  are  treated  as 
random  variables.  In  this  work,  the  first  meth¬ 
od  was  applied.  C  is  the  KL-transformed  sin¬ 
gle  large  image  which  is  reshaped  backward 
into  the  cube,  and  then  subjected  to  the  subse¬ 
quent  operations.  The  key  of  such  use  of  KLT 
is  the  data  decorrelation,  resulting  in  packing 
the  energy  of  the  signal  mostly  in  the  first  few 
bands  (Quirk  2003,  Vaidyanathan  1998).  This 
could  benefit  the  JPEG-2000  multi-band  com¬ 
pression,  where  many  compressed  bands  may 
contain  almost  no  energy.  KLT  is  reversible, 
which  means  the  original  image  B  can  be  re¬ 
constructed  based  on  the  transformed  data  C 
and  the  transformation  matrix  K : 

B  =  KC  (2) 

The  inverse  KLT  was  applied  for  the  recon¬ 
struction  (decompression)  process,  which  in- 
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eludes  data  reshaping  between  waveform  cube 
and  single  large  image,  but  in  the  reverse  order 
as  in  the  compression  process. 

In  the  second  test,  ID  WLT  was  applied  to 
each  waveform  separately  (Fig.  4).  The  results 
of  WLT  are  low-  and  high-frequency  compo¬ 
nents,  denoted  here  as  Lo  and  Hi.  Considering 
ID  WLT  as  applied  in  this  work,  both  Lo  and 
Hi  components  have  the  same  length,  equal  to 
half  of  the  length  of  the  original  signal  (wave¬ 
form).  Note  that  the  low-frequency  compo¬ 
nent  contains  most  of  the  original  signal  en¬ 
ergy,  thus  considering  such  transform  for  all 
waveforms,  the  energy  would  be  packed  into 
one  half  of  the  cube  bands.  Similarly,  the  Lo 
component  can  be  subjected  to  another  WLT 
resulting  in  two  new  components  having  half 
of  the  length  of  the  parent,  so  the  original  sig¬ 


nal  content  will  be  included  only  in  one  quar¬ 
ter  of  the  cube.  This  sequential  WLT  known 
as  multi-resolution  analysis  (MRA)  (Mallat 
1989)  will  finally  result  in  packing  the  energy 
of  the  signal  in  the  first  few  bands,  similar  to 
KLT  (see  the  length  of  Lo  component  in  Fig.  4). 
The  number,  n,  of  possible  levels  of  MRA  de¬ 
pends  on  the  length  of  the  original  signal,  n  is 
less  than  log2  ( m ),  where  m  is  the  length  of  the 
signal;  for  example,  for  a  waveform  length  of 
120,  used  in  this  study,  the  maximum  MRA  is 

7  levels;  however,  only  three  levels  were  used 
here  to  avoid  edge  effects  and  length  extension 
of  the  WLT  components.  Similarly  to  KLT, 
WLT  is  totally  reversible. 

From  the  perspective  of  compression  of  an 

8  bit  waveform  cube,  data  normalization  and 
transforms  contradict  to  the  idea  of  data  re- 
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Fig.  4:  Wavelet  transform  applied  on  the  large  image  of  the  whole  waveform  cube. 
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duction  since  these  operations  result  in  a  lar¬ 
ger  data  size  due  to  a  conversion  from  inte¬ 
gers  into  real  numbers,  usually  in  32  bit  or  64 
bit  representation.  Additionally,  some  of  the 
compressing  tools  do  not  allow  floating  point 
numbers  as  input  pixel  (sample)  values.  For 
example,  the  Kakadu  Software  accepts  only 
32  bit  input.  Therefore,  the  quantization  en¬ 
coding  is  needed  to  allow  mapping  floating 
point  numbers  into  integers.  Note  that  this 
process  is  invertible,  known  as  quantization 
decoding,  but  provides  no  perfect  reconstruc¬ 
tion.  The  experiments  on  the  test  data  showed 
that  an  8  bit  range  would  be  too  short  to  avoid 
large  quantization  errors,  and,  thus,  more  bits 
for  the  quantization  base  are  needed.  In  this 
experiment,  a  linear  quantizer  with  a  28  bit 
base  length  was  used  providing  much  larger 
dynamic  range  than  that  of  the  original  data 
(8  bits).  The  amount  of  the  introduced  quan¬ 
tization  noise  and  other  data  distortion,  such 
as  numerical  errors  of  data  normalization  and 
transforms  not  caused  by  JPEG-2000  com¬ 
pression,  was  empirically  evaluated.  First,  the 
test  waveform  cubes  were  subjected  to  for¬ 
ward  processing,  including  data  normaliza¬ 
tion,  transform,  and  quantization  encoding; 
and  then,  the  inverse  operation,  i.e.  quantiza¬ 
tion  decoding,  inverse  transform,  and  reverse 
data  normalization,  was  carried  out.  The  ob¬ 
served  maximal  absolute  difference  between 
the  value  of  the  original  and  reconstructed 
waveform  sample  was  on  the  level  of  le-4  that 
equals  to  0  in  integer  terms. 

The  results  of  the  investigation  of  image 
based  waveform  cube  compression  (Jozkow 
et  al.  2015)  showed  that  the  best  compres¬ 
sion  performance  was  obtained  for  the  set  of 
images  where  the  single  image  was  formed  by 
all  waveforms  of  a  single  scan  line  (according 
to  5  and  w  dimensions  in  Fig.  1).  Therefore, 
the  waveform  cube  was  rotated  prior  to  multi¬ 
band  compression  in  the  Kakadu  Software, 
thus  the  dimension  /  of  the  cube  was  treated  as 
the  band  during  JPEG-2000  multi-component 
transform. 

In  the  case  of  lossy  compression,  the  user 
can  decide  partially  about  the  amount  of  data 
degradation  and  compression  ratio.  Depend¬ 
ing  on  the  implementation,  a  quality  factor  is 
used  to  control  the  desired  compression  ratio. 
In  the  Kakadu  Software,  this  user  input  pa¬ 


rameter  is  the  bits  per  pixel  ratio,  i.e.  the  aver¬ 
age  number  of  bits  for  a  single  pixel  in  the 
compressed  file.  Obviously,  this  value  must  be 
smaller  than  the  bit  depth  in  the  original  image 
in  order  to  gain  a  reduction  in  the  file  size,  but 
a  smaller  ratio  means  a  larger  data  distortion 
due  to  compression.  The  resulting  ratio  might 
be  slightly  different  from  the  value  given  by 
the  user  because  the  compression  strength  de¬ 
pends  also  on  the  inherent  parameters  of  the 
data.  Since  the  original  waveform  data  is  8 
bits,  experiments  were  executed  for  20  para¬ 
meters,  ranging  from  0.4  to  8  bits  with  a  step 
size  of  0.4  bits.  Note  that  Kakadu  compressed 
JPEG-2000  file  might  include  several  recon¬ 
struction  qualities,  in  other  words,  one  file 
may  contain  data  compressed  with  different 
ratios  at  the  same  time,  but  this  option  was  not 
tested  during  this  investigation. 

4  Performance  of  Waveform  Data 
Compression 

Compression  performance  might  be  evalu¬ 
ated  from  different  perspectives,  such  as  the 
achieved  compression  ratio  and  reconstruc¬ 
tion  error.  The  compression  ratio  is  defined 
by  the  percentage  of  the  compressed  file  size 
with  respect  to  the  original  one.  In  the  case 
of  image  compression,  the  bits  per  pixel  ra¬ 
tio  (BPP)  or  the  bits  per  pixel  per  band  ratio 
(BPPPB)  are  frequently  used  to  describe  the 
compression  ratio,  depending  whether  a  sin¬ 
gle-  or  a  multi-band  image  is  compressed.  The 
BPP  value  is  the  number  of  bits  used  to  store 
a  single  pixel  in  the  compressed  image.  Due 
to  the  similarity  of  multi-band  images  and 
waveform  cubes,  the  BPPPB  was  used  in  this 
study.  Note  that  a  pixel  of  the  image  cube  is 
equivalent  to  a  waveform  sample  in  the  wave¬ 
form  cube.  Knowing  the  BPPPB  ratio  for  the 
original  and  compressed  cubes,  the  compres¬ 
sion  ratio  or  percentage  ratio  can  be  easily 
calculated.  The  compression  ratio  is  also  af¬ 
fected  by  the  file  header,  or  metadata,  essen¬ 
tial  for  decompression.  This  data  is  kept  in 
the  compressed  file,  increasing  its  size.  The 
final  BPPPB  ratio  was  calculated  on  the  basis 
of  the  file  size  produced  by  the  Kakadu  Soft¬ 
ware  and  the  number  of  waveform  samples  in 
the  waveform  cube.  Note  that  the  size  of  oth- 
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er  data,  mandatory  for  reconstruction,  such  as 
mean  values  of  bands  in  the  case  of  ZM  data 
or  K  matrix  in  the  case  of  KLT  transform  and 
quantization  parameters,  were  not  included  in 
this  calculation.  The  omission  of  these  param¬ 
eters  in  the  size  calculations  does  not  change 
the  BPPPB  ratio  significantly,  because  the  size 
of  these  parameters  is  much  smaller  than  the 
size  of  the  compressed  cube. 

The  performance  of  lossy  compression 
methods  is  related  to  the  data  degradation,  the 
distortion  (noise)  introduced  due  to  quantiza¬ 
tion  included  in  the  compression  process.  Re¬ 
construction  noise  (error)  can  be  measured  by 
different  parameters,  such  as  signal  to  noise 
ratio  (SNR),  peak  signal  to  noise  ratio  (PSNR) 
(Vaidyanathan  1993),  or  just  by  giving  sim¬ 
ple  statistics  of  the  differences  between  recon¬ 
structed  and  original  data.  In  this  work  data 
distortion  was  measured  by  the  SNR,  which 
is  based  on  the  variance  of  waveform  samples 
and  their  differences: 

_2 

SNR  =  10-  log10  —j—  (3) 

<K-r 

where 

o]  -  variance  of  all  the  original  waveform 
samples, 

ol_r  -  variance  of  waveform  sample  differ¬ 
ences  between  original  and  compressed 
data. 

For  a  low  data  degradation,  the  SNR  is 
large,  and  it  approaches  infinity  for  a  perfect 
reconstruction.  Note  that  the  calculated  SNR 
describes  only  quantization  noise  of  the  origi¬ 
nal  waveform  signal. 

Another  aspect  of  compression  perfor¬ 
mance  is  the  computational  cost,  the  compres¬ 
sion  and  decompression  speed  and  use  of  com¬ 
puter  resources.  In  our  previous  work  (Jozkow 
et  al.  2015),  it  was  concluded  that  2D  JPEG- 
2000  based  compression  is  fast  enough  to  sup¬ 
port  waveform  compression  during  the  acqui¬ 
sition  process  in  the  sensor  system,  as  wave¬ 
forms  forming  single  scan  lines  might  be  com¬ 
pressed  progressively.  Similarly,  waveform 
cubes  might  be  compressed  progressively  by 
the  approach  presented  in  this  work.  However, 
the  additional  transforms  for  data  decorrela¬ 
tion  introduced  in  this  experiment,  e.g.  KLT, 
make  the  algorithm  much  more  complex  and, 


consequently,  resulting  in  a  much  slower  ex¬ 
ecution  than  the  2D  compression.  Obviously, 
any  computation  on  larger  datasets  like  wave¬ 
form  cubes  requires  much  more  memory  re¬ 
sources  than  computation  performed  on  a 
small  part  of  this  data  like  the  single  image 
slice.  These  issues  were  not  considered  in  this 
work  in  evaluating  the  computational  expens¬ 
es  of  the  compression  performance. 

Finally,  the  impact  of  compression  noise 
can  be  evaluated  looking  at  the  results  of  sub¬ 
sequent  waveform  data  post  processing  tasks, 
but  executed  on  the  decompressed  data.  For 
example,  the  Gaussian  waveform  decomposi¬ 
tion  should  result  in  the  same  number  of  de¬ 
tected  echoes  with  insignificantly  different 
parameters  from  those  obtained  in  processing 
the  original  uncompressed  FWD.  Note  that 
even  well-established  waveform  decomposi¬ 
tion  methods  produce  varying  results,  just  as 
the  number  of  parameters  used  to  describe  the 
components  can  be  different,  for  example,  3 
and  4  (Chauve  et  al.  2007),  or  even  5  (Laky 
et  al.  2010).  Based  on  the  earlier  investiga¬ 
tion  (Jozkow  et  al.  2015),  it  was  concluded  that 
the  SNR  of  above  30  dB  -  35  dB  in  typical 
airborne  LiDAR  data  assures  an  acceptable 
waveform  reconstruction  error  which  will  not 
cause  significant  changes  in  waveform  shape 
and,  consequently,  does  not  affect  the  results 
of  subsequent  FWD  processing,  in  particular 
waveform  decomposition. 

5  Results  and  Discussion 

Numerical  experiments  were  performed 
with  all  combinations  of  the  three  decorre¬ 
lation  techniques  (OC,  ZM,  UV)  and  three 
transforms  (NOT,  KLT,  WLT)  at  20  differ¬ 
ent  user  specified  compression  ratios  for  two 
test  waveform  cubes  Cl  and  C2.  To  discuss 
and  analyze  the  effects,  experimental  results 
are  visualized  by  showing  the  SNR  as  a  func¬ 
tion  of  the  obtained  BPPPB  in  Fig.  5.  For  com¬ 
parison  purposes,  results  obtained  for  the 
same  data  but  using  the  earlier  proposed  ap¬ 
proach,  based  on  JPEG-2000  compression  of 
waveforms  arranged  in  the  set  of  2D  images 
(Jozkow  et  al.  2015)  was  added  to  the  figures. 
It  should  be  also  explained  why  2D  compres¬ 
sion  did  not  result  in  large  SNR  or  BPPPB  ra- 
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OC,  NOT  -  OC,  KLT  OC,  WLT 

Data  type  and  transform:  ZM,  NOT -  ZM,  KLT -  ZM,  WLT  - 

UV,  NOT  .  UV,  KLT  UV,  WLT 


OC,  NOT  -  OC,  KLT  OC,  WLT 

Data  type  and  transform:  ZM,  NOT -  ZM,  KLT -  ZM,  WLT 

UV,  NOT  .  UV,  KLT  UV,  WLT 


and  waveform:  OC,  NOT,  emitted  OC,  KLT,  emitted  2D,  emitted 

Fig.  5:  SNR  as  a  function  of  BPPPB  ratio,  BPPPB  =  bits  per  pixel  per  band  ratio. 
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tio.  For  the  user  input  ratios  2.4  bits  and  larger, 
the  obtained  SNR  was  always  similar,  about 
35  dB  and  41  dB  for  Cl  and  C2,  respectively, 
as  well  as  similar  was  the  BPPPB  ratio,  about 
1.7  and  2  bits  for  Cl  and  C2,  respectively.  This 
explains  the  higher  dynamic  of  JPEG-2000 
based  3D  compression  than  2D  compression 
of  the  waveform  cube. 

Comparing  results  obtained  for  Cl  and  C2 
test  cubes,  it  is  clearly  seen  that  the  impact  of 
using  different  sensors  for  collecting  FWD 
for  the  same  area  is  insignificant;  a  slightly 
larger  (maximally  5  dB)  SNR  was  obtained 
for  the  test  cube  C2.  The  most  likely  reason  is 
the  lower  internal  complexity  of  cube  C2,  as 
the  compression  of  images  containing  less  de¬ 
tails  results  in  lower  reconstruction  noise  for 
the  same  compression  rate.  Among  the  used 
transforms,  the  worst  SNR  was  obtained  for 
WLT.  This  could  be  explained  with  the  fact 
that  Lo  and  Hi  components  are  usually  in  very 
different  ranges  and,  thus,  a  non-linear  quan¬ 
tizer  might  be  more  appropriate  for  preserv¬ 
ing  better  dynamic  range  of  the  quantized 
component  values  prior  to  JPEG-2000  com¬ 
pression.  Differences  between  the  other  two 
transforms,  KLT  and  NOT,  are  maximally  of 
about  5  dB  for  the  same  BPPPB  ratio,  where 
higher  SNR  was  obtained  for  KLT  in  the  case 
of  small  ratios  and  NOT  in  the  case  of  large 
ratios.  Considering  that  the  used  KLT  is  adap¬ 
tive  (needs  to  be  calculated  for  every  dataset), 
and  therefore  highly  computational  expensive, 
and  the  improvement  of  the  SNR  by  a  few  dB 
only  for  small  BPPPB  ratios  compared  to  the 
case  of  using  no  transform  (NOT),  it  is  clearly 
not  beneficial  in  practice.  Obviously,  a  fixed 
KLT  matrix  for  similar  datasets  might  be  used 
to  reduce  the  number  of  computations,  but  it 
is  extremely  difficult  to  find  representative 
datasets  of  FWD  to  create  a  fixed  KLT  base 
(Jozkow  et  al.  2015).  The  last  aspect  of  the  in¬ 
vestigated  approach  is  the  data  normalization 
method.  The  worst  SNR  was  obtained  for  UV. 
Differences  between  OC  and  ZM  are  insignif¬ 
icant,  which  implies  that  data  do  not  require 
any  normalization  and  the  intensities  of  the 
original  waveform  samples  are  suitable  for  the 
compression. 

Based  on  the  Cl  and  C2  results,  compres¬ 
sion  experiments  with  the  C3  dataset  were  ex¬ 
ecuted  only  for  the  best  performing  data  nor¬ 


malization  (OC)  and  transforms  (NOT,  KLT), 
resulting  in  similar  performance  for  compress¬ 
ing  received  waveforms.  SNR  obtained  for 
C3,  however,  was  about  10  dB  larger  than  for 
the  Cl  and  C2  cubes.  This  may  be  explained 
by  the  simpler  object  complexity  of  the  C3 
area.  Similarly,  a  difference  of  performance 
between  compressing  cubes  of  emitted  and 
received  waveforms  was  observed,  as  a  large 
number  of  zero  samples  in  the  received  cube 
resulted  in  higher  SNR  for  large  compres¬ 
sion  ratios.  In  contrast,  the  strong  similarity 
of  the  emitted  waveforms  allowed  for  higher 
SNR  for  small  compression  ratios.  This  close 
similarity  of  the  outgoing  waveforms  was  also 
exploited  by  applying  an  additional  decorre- 
lating  transform.  For  example,  KLT  applied  to 
the  emitted  cube  resulted  in  higher  SNR,  es¬ 
pecially  for  large  compression  ratios,  than  in 
the  case  where  the  preprocessing  transform 
was  not  applied  (NOT).  Since  the  outgoing 
waveforms  change  very  little,  instead  of  us¬ 
ing  adaptive  KLT,  a  fixed  KLT  may  be  applied 
to  reduce  the  computational  expenses.  Finally, 
comparing  results  of  2D  with  3D  compression 
approaches,  the  same  observations  can  be  not¬ 
ed  as  for  the  smaller  cubes  Cl  and  C2.  Com¬ 
pression  performance  differences  for  emitted 
and  received  cubes  in  the  2D  approach  follow 
the  same  pattern  as  for  3D  approach. 

Comparing  waveform  compression  results 
of  the  earlier  2D  and  here  proposed  3D  meth¬ 
ods,  both  based  on  the  JPEG-2000  Standard, 
the  difference  is  not  significant;  for  example, 
for  unnormalized  and  untransformed  data,  the 
3D  approach  for  small  BPPPB  ratio  gives  a 
slightly  larger  SNR  than  the  2D  approach,  the 
difference  being  about  5  dB  -  10  dB.  Clearly, 
the  flexibility  of  the  3D  approach  to  adjust  the 
data  degradation  and  compression  ratio  is  an 
obvious  advantage. 


6  Conclusions 

This  work  investigated  the  feasibility  of  com¬ 
pressing  waveform  cube  using  multi-com¬ 
ponent  JPEG-2000  extension.  The  tested  ap¬ 
proaches  included  additional  computations, 
such  as  data  normalization  and  transforms 
prior  to  JPEG-2000  compression. 
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Based  on  the  numerical  experiments  per¬ 
formed  on  three  waveform  cubes,  it  was  con¬ 
cluded  that  in  relation  to  2D  JPEG-2000  based 
compression,  the  multi-component  extension 
is  more  flexible,  because  the  user  can  chose 
between  a  larger  range  of  compression  ra¬ 
tios  and  select  larger  file  sizes  to  obtain  very 
low  data  loss,  which  was  not  possible  for  the 
2D  approach.  For  larger  compression  ratios 
(small  BPPPB  ratio),  however,  both  2D  and 
3D  approaches  result  in  similar  performance 
in  terms  of  data  degradation  and  reduction  of 
the  file  size.  Note  that  for  both  approaches, 
this  similar  performance  was  obtained  for  the 
same  cube  orientation  where  bands  (images) 
were  formed  from  waveforms  representing 
single  scan  lines,  offering  more  flexibility  for 
the  practical  use  where  the  same  compression 
tool  might  be  used  with  two  different  vari¬ 
ants  depending  on  the  available  computational 
power.  More  importantly,  both  single  imag¬ 
es  and  waveform  cubes  can  be  then  progres¬ 
sively  compressed  according  to  the  waveform 
data  acquisition  order.  The  advantage  of  2D 
approach  is  speed,  but  the  disadvantage  is  the 
low  dynamic  range  and  the  inability  to  achieve 
a  large  SNR.  Multi- component  compression  is 
slower,  but  gives  the  user  more  choices  on  de¬ 
ciding  about  the  amount  of  data  degradation. 

The  used  implementation  of  the  JPEG-2000 
Standard  with  wavelet-based  multi-compo¬ 
nent  transform  performed  well  in  decorrelat- 
ing  the  original  waveform  cube  data.  Addi¬ 
tional  data  normalization  or  transform  of  the 
original  waveform  cube  did  not  show  signifi¬ 
cant  improvements  in  3D  compression  perfor¬ 
mance,  and  only  caused  extra  computational 
costs. 

Finally,  one  more  advantage  of  using  JPEG- 
2000  Standard  for  compressing  waveforms 
in  both  approaches  is  the  possibility  of  keep¬ 
ing  different  reconstruction  levels  in  one,  but 
larger  file.  It  can  be  useful  for  data  distribu¬ 
tion  with  different  distortion  and  compression 
levels  depending  on  the  application  require¬ 
ments. 
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